I, TOD BEDILION, a citizen of the United States, residing at 132 Winding Way, 
San Carlos, California, declare that: 



1. I was employed by Incyte Corporation (hereinafter "Incyte") as a Director 
of Corporate Development until May 11, 2001. I am currently under contract to be a Consultant 
to Incyte Corporation. 

2. In 1996, I received a Ph.D. degree in Cell, Molecular and Development 
Biology from UCLA. I had previously received, in 1988, a B.S. degree in biology from UCLA. 

Upon my graduation from UCLA, I became, in April 1996, the first employee of 
Synteni, Inc. (hereinafter "Synteni"). I was a Research Director at Synteni from April 1996 until 
Synteni was acquired by Incyte in early 1998. 

I understand that Synteni was founded in 1994 by T. Dari Shalon while he was a 
graduate student at Stanford University. I further understand that Synteni was founded for the 
purpose of commercially exploiting certain "cDNA microarray" technology that was being 
worked on at Stanford in the early to mid-1990s. That technology, which I will sometimes refer 
to herein as the "Stanford-developed cDNA microarray technology", was the subject of Dr. 
Shalon's doctoral thesis at Stanford. I understand and believe that Dr. P.O. Brown was Dr. 
Shalon's thesis advisor at Stanford. 

During the period beginning before I was employed by Synteni and ending upon 
its acquisition by Incyte in early 1998, I understand Synteni was the exclusive licensee of the 
Stanford-developed cDNA microarray technology, subject to any right that the United States 
government may have with respect to that technology. In early 1998, I understand Incyte 
acquired rights under the Stanford-developed cDNA microarray technology as part of its 
acquisition of Synteni. 

I understand that at the time of the commencement of my employment at Synteni 
in April 1996, Synteni's rights with respect to the Stanford-developed cDNA technology included 
rights under a United States patent application that had been filed June 7, 1995 in the names of 
Drs. Brown and Shalon and that subsequently issued as United States Patent No. 5,807,522 (the 
Brown '522 patent). In December 1995, the subject matter of the Brown '522 patent was 
published based on a PCT patent application that had also been filed in June 1995. The Brown 
'522 patent (and its corresponding PCT application) describes the use of the Stanford-developed 
cDNA technology in a number of gene expression monitoring applications, as will be discussed 
more fully below. 

Upon Incyte's acquisition of Synteni, I became employed by Incyte. From early 
1998 until late 1999, I was an Associate Research Director at Incyte. In late 1999, I was 
promoted to the position of Director, Corporate Development. 

I have been aware of the Stanford-developed cDNA microarray technology since 
shortly before I commenced my employment at Synteni. While I was employed by Synteni, 
virtually all (if not all) of my work efforts (as well as the work efforts of others employed by 
Synteni) were directed to the further development and commercial exploitation of that cDNA 
microarray technology. By the end of 1997, those efforts had progressed to the point that I 
understand Incyte agreed to pay at least about $80 million to acquire Synteni. Since I have been 
employed by Incyte, I have continued to work on the further development and commercial 
exploitation of the cDNA microarray technology that was first developed at Stanford in the early 
to mid-1990s. 



3. I have reviewed the specification of a United States patent application that 
I understand was filed on April 23, 2001 in the names of Huei-Mei Chen et al. and was assigned 
Serial No. 09/870,746 (hereinafter "the Chen '746 application"). In broad overview, the Chen 
'746 specification pertains to certain nucleotide and amino acid sequences and their use in a 
number of applications, including gene expression monitoring applications that are useful in 
connection with (a) developing drugs (e.g., the diagnosis of inherited and acquired genetic 
disorders, expression profiling, toxicology testing, and drug development with respect to breast 
cancer) and (b) monitoring the activity of drugs for purposes relating to evaluating their efficacy 
and toxicity. 

4. I understand that (a) the Chen '746 application contains claims that are 
directed to isolated and purified polynucleotides having the sequences of SEQ ID NO:1-encoding 
polynucleotides, for example SEQ ID NO:2 (hereinafter "the SEQ ID NO:1-encoding 
polynucleotides"), and (b) the Patent Examiner has rejected those claims on the grounds that the 
specification of the Chen '746 application does not disclose a substantial, specific and credible 
utility for the claimed SEQ ID NO:1-encoding polynucleotides. I further understand that 
whether or not a patent specification discloses a substantial, specific and credible utility for its 
claimed subject matter is properly determined from the perspective of a person skilled in the art 
to which the specification pertains at the time the patent application was filed. In addition, I 
understand that a substantial, specific and credible utility under the patent laws must be a 
"real-world" utility. 

5. I have been asked (a) to consider with a view to reaching a conclusion (or 
conclusions) as to whether or not I agree with the Patent Examiner's position that the Chen '746 
application does not disclose a substantial, specific and credible "real-world" utility for the 
claimed SEQ ID NO:1-encoding polynucleotides, and (b) to state and explain the bases for any 
conclusions I reach. I have been informed that, in connection with my considerations, I should 
determine whether or not a person skilled in the art to which the Chen '746 application pertains 
on April 23, 2001 would have concluded that the Chen '746 application disclosed, for the benefit 
of the public, a specific beneficial use of the SEQ ID NO:1-encoding polynucleotides in their 
then available and disclosed form. I have also been informed that, with respect to the 
"real-world" utility requirement, the Patent and Trademark Office instructs its Patent Examiners in 
Section 2107 of the Manual of Patent Examining Procedure, under the heading "I. Specific and 
Substantial Requirement," sub-heading "Research Tools": 

"Many research tools such as gas chromatographs, screening assays, and 
nucleotide sequencing techniques have a clear, specific and unquestionable utility 
(e.g., they are useful in analyzing compounds). An assessment that focuses on 
whether an invention is useful only in a research setting thus does not address 
whether the specific invention is in fact 'useful' in a patent sense. Instead, Office 
personnel must distinguish between inventions that have a specifically identified 
substantial utility and inventions whose asserted utility requires further research to 
identify or reasonably confirm." 

6. I have considered the matters set forth in paragraph 5 of this Declaration 
and have concluded that, contrary to the position I understand the Patent Examiner has taken, the 
specification of the Chen '746 patent application disclosed to a person skilled in the art at the 
time of its filing a number of substantial, specific and credible real-world utilities for the claimed 
SEQ ID NO:1-encoding polynucleotides. More specifically, persons skilled in the art on April 
23, 2001 would have understood the Chen '746 application to disclose the use of the SEQ ID 
NO:1-encoding polynucleotides in a number of gene expression monitoring applications that 
were well-known at that time to be useful in connection with the development of drugs and the 
monitoring of the activity of such drugs. I explain the bases for reaching my conclusion in this 
regard in paragraphs 7-16 below. 

7. In reaching the conclusion stated in paragraph 6 of this Declaration, I 
considered (a) the specification of the Chen '746 application, and (b) a number of published 
articles and patent documents that evidence gene expression monitoring techniques that were 
well-known before the April 23, 2001 filing date of the Chen '746 application. The published 
articles and patent documents I considered are: 

(a) Schena, M., Shalon, D., Heller, R., Chai, A., Brown, P.O., and 
Davis, R.W., Parallel human genome analysis: Microarray-based expression monitoring of 1000 
genes, Proc. Natl. Acad. Sci. USA, 93, 10614-10619 (1996) (hereinafter "the Schena 1996 
article") (copy annexed at Tab A); 

(b) Schena, M., Shalon, D., Davis, R.W., Brown, P.O., Quantitative 
Monitoring of Gene Expression Patterns with a Complementary DNA Microarray, Science, 270, 
467-470 (1995) (hereinafter "the Schena 1995 article") (copy annexed at Tab B); 

(c) Shalon and Brown PCT patent application WO 95/35505 titled 
"Method and Apparatus For Fabricating Microarrays Of Biological Samples," filed on June 16, 
1995, and published on December 28, 1995 (hereinafter "the Shalon PCT application") (copy 
annexed at Tab C); 

(d) Brown and Shalon U.S. Patent No. 5,807,522, corresponding to the 
Shalon PCT application, titled "Methods For Fabricating Microarrays Of Biological Samples," 
filed on June 7, 1995 and issued on September 15, 1998 (hereinafter "the Brown '522 patent") 
(copy annexed at Tab D); 

(e) DeRisi, J., Penland, L., and Brown, P.O. (Group 1); Bittner, M.L., 
Meltzer, P.S., Ray, M., Chen, Y., Su, Y.A., and Trent, J.M. (Group 2), Use of a cDNA 
microarray to analyse gene expression patterns in human cancer, Nat. Genet., 14(4), 457-460 
(1996) (hereinafter "the DeRisi article") (copy annexed at Tab E); 

(f) Shalon, D., Smith, S.J., and Brown, P.O., A DNA Microarray 
System for Analyzing Complex DNA Samples Using Two-color Fluorescent Probe 
Hybridization, Genome Res., 6(7), 639-645 (1996) (hereinafter "the Shalon article") (copy 
annexed at Tab F); 

(g) Heller, R.A., Schena, M., Chai, A., Shalon, D., Bedilion, T., 
Gilmore, J., Woolley, D.E., and Davis, R.W., Discovery and analysis of inflammatory 
disease-related genes using cDNA microarrays, Proc. Natl. Acad. Sci. USA, 94, 2150-2155 (1997) 
(hereinafter "the Heller article") (copy annexed at Tab G); 

(h) Sambrook, J., Fritsch, E.F., Maniatis, T., Molecular Cloning, A 
Laboratory Manual, pages 7.37 and 7.38, Cold Spring Harbor Press (1989) (hereinafter "the 
Sambrook Manual") (copy annexed at Tab H). 

8. Many of the published articles and patent documents I considered (i.e., at 
least items (a)-(f) identified in paragraph 7) relate to work done at Stanford University in the 
early and mid-1990s with respect to the development of cDNA microarrays for use in gene 
expression monitoring applications under which Synteni became exclusively licensed. As I will 
discuss, a person skilled in the art who read the Chen '746 application on April 23, 2001 would 
have understood that application to disclose the SEQ ID NO:1-encoding polynucleotides to be 
useful for a number of gene expression monitoring applications, e.g., as a probe for the 
expression of that specific polynucleotide in cDNA microarrays of the type first developed at 
Stanford. 

Furthermore, items (a)-(g) establish that gene expression monitoring applications 
utilizing cDNA microarrays were well-known and established methods routinely used in 
toxicology testing and drug development at the time of filing the Chen '746 application and for 
several years prior to April 23, 2001. As such, one of ordinary skill in the art would have 
recognized that the SEQ ID NO:1-encoding polynucleotides could be used in toxicology testing 
and drug development, irrespective of the biochemical activities of the encoded polypeptide. 

9. Turning more specifically to the Chen '746 specification, the SEQ ID 
NO:2 polynucleotide is shown at pp. 3-7 as one of twenty sequences under the heading 
"Sequence Listing." The Chen '746 specification specifically teaches that "the invention 
provides an isolated cDNA . . . selected from the group consisting of SEQ ID NO:2 . . ." (Chen '746 
application at p. 3). It further teaches that (a) the identity of the SEQ ID NO:2 polynucleotide 
was determined from a human fetal lung tissue library (LUNGFET05) (Chen '746 application, p. 
9), (b) the SEQ ID NO:2 polynucleotide encodes for the Mucin-Related Tumor Marker (MRTM) 
shown as SEQ ID NO:1 (Chen '746 application at p. 3, lines 14-21), and (c) microarray 
experiments show the differential expression of MRTM in human breast adenocarcinoma cells 
(BT20) compared with normal breast epithelial cells (Chen '746 application at p. 10, lines 5-8). 

The Chen '746 application discusses a number of uses of the SEQ ID NO:1-encoding 
polynucleotides in addition to their use in gene expression monitoring applications. I have not 
fully evaluated these additional uses in connection with the preparation of this Declaration and 
do not express any views in this Declaration regarding whether or not the Chen '746 
specification discloses these additional uses to be substantial, specific and credible real-world 
utilities of the SEQ ID NO:1-encoding polynucleotides. Consequently, my discussion in this 
Declaration concerning the Chen '746 application focuses on the portions of the application that 
relate to the use of the SEQ ID NO:1-encoding polynucleotides in gene expression monitoring 
applications. 

10. The Chen '746 application discloses that the polynucleotide sequences 
disclosed therein, including the SEQ ID NO:1-encoding polynucleotides, are useful as probes in 
microarrays. It further teaches that the microarrays can be used "to monitor the expression level 
of large numbers of genes simultaneously" for a number of purposes, including "to develop and 
monitor the activities of therapeutic agents" (Chen '746 application at p. 14, lines 4-9). 

In the same paragraph of the Chen '746 application described above, the Chen 
'746 application teaches that microarrays can be prepared using the previously mentioned cDNA 
microarray technology developed at Stanford in the early to mid-1990s. In this connection, the 
Chen '746 application specifically cites to the Schena 1996 article identified in item (a) of 
paragraph 7 of this Declaration (Chen '746 application at p. 14; see paragraph 7, supra). 

The Schena 1996 article is one of a number of documents that were published 
prior to the April 23, 2001 filing date of the Chen '746 application that describe the use of the 
Stanford-developed cDNA technology in a wide range of gene expression monitoring 
applications, including monitoring and analyzing gene expression patterns in human cancer. In 
view of the Chen '746 application, the Schena 1996 article, and other related pre-April 2001 
publications, persons skilled in the art on April 23, 2001 clearly would have understood the Chen 
'746 application to disclose the SEQ ID NO:1-encoding polynucleotides to be useful in cDNA 
microarrays for the development of new drugs and monitoring the activities of drugs for such 
purposes as evaluating their efficacy and toxicity, as explained more fully in paragraph 15 below. 

With specific reference to toxicity evaluations, those of skill in the art who were 
working on drug development in April 2001 (and for many years prior to April 2001) without 
any doubt appreciated that the toxicity (or lack of toxicity) of any proposed drug they were 
working on was one of the most important criteria to be considered and evaluated in connection 
with the development of the drug. They would have understood at that time that good drugs are 
not only potent, they are specific. This means that they have strong effects on a specific 
biological target and minimal effects on all other biological targets. Ascertaining that a 
candidate drug affects its intended target, and identification of undesirable secondary effects (i.e., 
toxic side effects), had been for many years among the main challenges in developing new drugs. 
The ability to determine which genes are positively affected by a given drug, coupled with the 
ability to quickly and at the earliest time possible in the drug development process identify drugs 
that are likely to be toxic because of their undesirable secondary effects, has enormous value in 
improving the efficiency of the drug discovery process, and is an important and essential part of 
the development of any new drug. Accordingly, the teachings in the Chen '746 application, in 
particular regarding use of the SEQ ID NO:1-encoding polynucleotides in differential gene 
expression analysis and in the development and the monitoring of the activities of drugs, clearly 
include toxicity studies and persons skilled in the art who read the Chen '746 application on 
April 23, 2001 would have understood that to be so. 

11. The Schena 1996 article was not the first publication that described the use 
of the cDNA microarray technique developed at Stanford to monitor quantitatively gene 
expression patterns. More than a year earlier (i.e., in October 1995), the Schena 1995 article, 
titled "Quantitative Monitoring of Gene Expression Patterns with a Complementary DNA 
Microarray", was published (see Tabs A and B). 

12. As previously discussed (supra, paragraphs 2 and 7), in the mid-1990s 
patent applications were filed in the names of Drs. Shalon and Brown that described the 
Stanford-developed cDNA microarray technology. The two patent documents (i.e., the Shalon 
PCT application and the Brown '522 patent) annexed to this Declaration at Tabs C and D 
evidence information that was available to the public regarding the Stanford-developed cDNA 
microarray technology before the April 23, 2001 filing date of the Chen '746 application. 

The Shalon PCT patent application, which was published in December 1995, 
contains virtually the same (if not exactly the same) specification as the Brown '522 patent. 
Hence, the Brown '522 patent disclosure was, in effect, available to the public as of the 
December 1995 publication date of the Shalon PCT application (see Tabs C and D). For the sake 
of convenience, I cite to and discuss the Brown '522 specification below on the understanding 
that the descriptions in that specification were published as of the December 28, 1995 publication 
date of the Shalon PCT application. 

The Brown '522 patent discusses, in detail, the utility of the Stanford-developed 
cDNA microarrays in gene expression monitoring applications. For example, in the "Summary 
Of The Invention" section, the Brown '522 patent teaches (see Tab D, col. 4, line 52-col. 5, line 
8): 

Also forming part of the invention is a method of detecting 
differential expression of each of a plurality of genes in a first cell 
type, with respect to expression of the same genes in a second cell 
type. In practicing the method, there is first produced fluorescent- 
labeled cDNAs from mRNAs isolated from two cells types, where 
the cDNAs from the first and second cell types are labeled with 
first and second different fluorescent reporters. 

A mixture of the labeled cDNAs from the two cell types is 
added to an array of polynucleotides representing a plurality of 
known genes derived from the two cell types, under conditions that 
result in hybridization of the cDNAs to complementary-sequence 
polynucleotides in the array. The array is then examined by 
fluorescence under fluorescence excitation conditions in which (i) 
polynucleotides in the array that are hybridized predominantly to 
cDNAs derived from one of the first or second cell types give a 
distinct first and second fluorescence emission color, respectively, 
and (ii) polynucleotides in the array that are hybridized to 
substantially equal numbers of cDNAs derived from the first and 
second cell types give a distinct combined fluorescence emission 
color, respectively. The relative expression of known genes in the 
two cell types can then be determined by the observed fluorescence 
emission color of each spot. 

The Brown '522 patent further teaches that the "[m]icroarrays of immobilized 
nucleic acid sequences prepared in accordance with the invention" can be used in "numerous" 
genetic applications, including "monitoring of gene expression" applications (see Tab D at col. 
14, lines 36-42). The Brown '522 patent teaches (a) monitoring gene expression (i) in different 
tissue types, (ii) in different disease states, and (iii) in response to different drugs, and (b) that 
arrays disclosed therein may be used in toxicology studies (see Tab D at col. 15, lines 13-18 and 
52-58 and col. 18, lines 25-30). 

13. Also pertinent to my considerations underlying this Declaration is the 
DeRisi article, published in December 1996. The DeRisi article describes the use of the 
Stanford-developed cDNA microarray technology "to analyze gene expression patterns in human 
cancer" (see Tab E at, e.g., p. 457). The DeRisi article specifically indicates, consistent with 
what was apparent to persons skilled in the art in December 1996, that increasing the number of 
genes on the cDNA microarray permits a "more comprehensive survey of gene expression 
patterns," thereby enhancing the ability of the cDNA microarray to provide "new and useful 
insights into human biology and a deeper understanding of the gene pathways involved in the 
pathogenesis of cancer and other diseases" (see Tab E at p. 458). 

14. Other pre-April 2001 publications further evidence the utility of the 
cDNA microarrays first developed at Stanford in a wide range of gene expression monitoring 
applications (see, e.g., the Shalon and the Heller articles at Tabs F and G). By no later than the 
March 1997 publication of the Heller article, these publications showed that employees of 
Synteni (i.e., James Gilmore and myself) had used the cDNA microarrays in specific gene 
expression monitoring applications (see Tab G). 

The Heller article states that the results reported therein "successfully demonstrate 
the use of the cDNA microarray system as a general approach for dissecting human diseases" 
(Tab G at p. 2150). Among other things, the Heller article describes the investigation of "1000 
human genes that were randomly selected from a peripheral human blood cell library" and 
"[t]heir differential and quantitative expression analysis in cells of the joint tissue . . . to 
demonstrate the utility of the microarray method to analyze complex diseases by their pattern of 
gene expression" (see Tab G at pp. 2150 et seq.). 

Much of the work reported on in the Heller article was done in 1996. That article, 
therefore, evidences how persons skilled in the art were readily able, well prior to April 23, 
2001, to make and use cDNA microarrays to achieve highly useful results. For example, as 
reported in the Heller article, a cDNA microarray that was used in some of the highly successful 
work reported on therein was made from 1,000 genes randomly selected from a human blood cell 
library. 



15. A person skilled in the art on April 23, 2001, who read the Chen '746 
application, would understand that application to disclose the SEQ ID NO:1-encoding 
polynucleotides, for example, SEQ ID NO:2, to be highly useful as probes for the expression of 
that specific polynucleotide in cDNA microarrays of the type first developed at Stanford. For 
example, the specification of the Chen '746 application would have led a person skilled in the art 
in April 2001 who was using gene expression monitoring in connection with working on 
developing new drugs for the treatment of breast cancer to conclude that a cDNA microarray that 
contained the SEQ ID NO:1-encoding polynucleotides would be a highly useful tool and to 
request specifically that any cDNA microarray that was being used for such purposes contain the 
SEQ ID NO:1-encoding polynucleotides. Persons skilled in the art would appreciate that cDNA 
microarrays that contained the SEQ ID NO:1-encoding polynucleotides would be a more useful 
tool than cDNA microarrays that did not contain the polynucleotides in connection with 
conducting gene expression monitoring studies on proposed (or actual) drugs for treating breast 
cancer for such purposes as evaluating their efficacy and toxicity. 

I discuss in more detail in items (a)-(f) below a number of reasons why a person 
skilled in the art, who read the Chen '746 specification in April 2001, would have concluded 
based on that specification and the state of the art at that time, that the SEQ ID NO:1-encoding 
polynucleotides would be a highly useful tool for inclusion in cDNA microarrays for evaluating 
the efficacy and toxicity of proposed drugs for treating breast cancer, as well as for other 
evaluations: 

(a) The Chen '746 application teaches the SEQ ID NO:1-encoding 
polynucleotides to be useful as probes in cDNA microarrays of the type first developed at 
Stanford. It also teaches that such cDNA microarrays are useful in a number of gene expression 
monitoring applications, including "developing and monitoring the activity of therapeutic agents 
[i.e., drugs]" (see paragraph 10, supra). 

(b) By April 2001, the Stanford-developed cDNA microarray technology 
was a well known and widely accepted tool for use in a wide range of gene expression 
monitoring applications. This is evidenced, for example, by numerous publications describing 
the use of that cDNA technology in gene expression monitoring applications and the fact that, for 
over a year, the technology had provided the basis for the operations of an up-and-running 
company (Synteni), with employees, that was created for the purpose of developing and 
commercially exploiting that technology (see paragraphs 2, 8 and 10-14, supra). The fact that 
Incyte agreed to purchase Synteni in late 1997 for an amount reported to be at least about $80 
million only serves to underscore the substantial practical and commercial significance, in 1997, 
of the cDNA microarray technology first developed at Stanford (see paragraph 2, supra). 

(c) The pre-April 2001 publications regarding the cDNA microarray 
technology first developed at Stanford that I discuss in this Declaration repeatedly confirm that, 
consistent with the teachings in the Chen '746 application, cDNA microarrays are highly useful 
tools for conducting gene expression monitoring applications with respect to the development of 
drugs and the monitoring of their activity. Among other things, those pre-April 2001 
publications confirmed that cDNA microarrays (i) were useful for monitoring gene expression 
responses to different drugs (see paragraph 12, supra), (ii) were useful in analyzing gene 
expression patterns in human cancer, with increasing the number of genes on the cDNA 
microarray enhancing the ability of the cDNA microarray to provide useful information (see 
paragraph 13, supra), and (iii) were a valuable tool for use as part of a "general approach for 
dissecting human diseases" and for "analyz[ing] complex diseases by their pattern of gene 
expression" (see paragraph 14, supra). 

(d) Based on my own extensive work for a company whose business was 
the development and commercial exploitation of cDNA microarray technology for more than two 
years prior to the April 2001 filing date of the Chen '746 application, I have first-hand 
knowledge concerning the state of the art with respect to making and using cDNA microarrays as 
of April 23, 2001 (see paragraphs 2 and 14, supra). Persons skilled in the art as of that date 
would have (a) concluded that the Chen '746 application disclosed cDNA microarrays containing 
the SEQ ID NO:1-encoding polynucleotides to be useful, and (b) readily been able to make and 
use such microarrays with useful results. 

(e) The Chen '746 specification contains a number of teachings that would 
lead persons skilled in the art on April 23, 2001 to conclude that a cDNA microarray that 
contained the SEQ ID NO:1-encoding polynucleotides would be a more useful tool for gene 
expression monitoring applications relating to drugs for treating breast cancer than a cDNA 
microarray that did not contain the SEQ ID NO:1-encoding polynucleotides. Among other 
things, the Chen '746 specification teaches that the identity of the SEQ ID NO:2 polynucleotide 
was determined from a fetal lung tissue cDNA library (LUNGFET05) (Chen '746 application, p. 
9, lines 25-26). Moreover, microarray experiments show the differential expression of MRTM in 
human breast adenocarcinoma cells (BT20) compared with normal breast epithelial cells (Chen 
'746 application at p. 10, lines 5-8) (see paragraph 9, supra). 

(f) Persons skilled in the art on April 23, 2001 would have appreciated (i) 
that the gene expression monitoring results obtained using a cDNA microarray containing a 
probe to a sequence selected from the group consisting of SEQ ID NO:1-encoding 
polynucleotides would vary, depending on the particular drug being evaluated, and (ii) that such 
varying results would occur both with respect to the results obtained from the probe described in 
(i) and from the cDNA microarray as a whole (including all its other individual probes). These 
kinds of varying results, depending on the identity of the drug being tested, in no way detract 
from my conclusion that persons skilled in the art on April 23, 2001, having read the Chen '746 
specification, would specifically request that any cDNA microarray that was being used for 
conducting gene expression monitoring studies on drugs for treating breast cancer (e.g., a 
toxicology study or any efficacy study of the type that typically takes place in connection with 
the development of a drug) contain any one of the SEQ ID NO:1-encoding polynucleotides as a 
probe. Persons skilled in the art on April 23, 2001 would have wanted their cDNA microarray to 
have a probe as described in (i) because a microarray that contained such a probe (as compared to 
one that did not) would provide more useful results in the kind of gene expression monitoring 
studies using cDNA microarrays that persons skilled in the art have been doing since well prior 
to April 23, 2001. 

The foregoing is not intended to be an all-inclusive explanation of all my reasons 
for reaching the conclusions stated in this paragraph 15, and in paragraph 6, supra. In my view, 
however, it provides more than sufficient reasons to justify my conclusions stated in paragraph 6 
of this Declaration regarding the Chen '746 application disclosing to persons skilled in the art at 
the time of its filing substantial, specific and credible real-world utilities for the 
SEQ ID NO:1-encoding polynucleotides. 

16. Also pertinent to my considerations underlying this Declaration is the fact 
that the Chen '746 disclosure regarding the uses of the SEQ ID NO:2 polynucleotide for gene 
expression monitoring applications is not limited to the use of that polynucleotide as a probe in 
microarrays. For one thing, the Chen '746 specification teaches that the polynucleotides 
described therein (including the polynucleotide of SEQ ID NO:2) may desirably be used as 
probes in any of a number of long established "standard" non-microarray techniques, such as 
Northern analysis, for conducting gene expression monitoring studies. See, e.g.: 

(a) Chen '746 application at p. 8, lines 13-16 ("'Probe' refers to a cDNA 
that hybridizes to at least one nucleic acid in a sample. Where targets are single stranded, probes 
are complementary single strands. Probes can be labeled with reporter molecules for use in 
hybridization reactions including Southern, northern, in situ, dot blot, array, and like 
technologies ...."); and 

(b) Chen '746 application at p. 18, lines 11-15 ("In order to provide 
standards for establishing differential expression, normal and disease expression profiles are 
established. This is accomplished by combining a sample taken from normal subjects, . . . with a 
cDNA under conditions for hybridization to occur. Standard hybridization complexes may be 
quantified by comparing the values obtained using normal subjects with values from an 
experiment in which a known amount of a purified sequence is used") (emphasis supplied). 

The "Sambrook et al." reference is a reference that was well known to persons 
skilled in the art in August 2000 and is cited in the Chen '746 application at p. 14. A copy of 
pages from that reference manual, which was published in 1989, is annexed to this Declaration at 
Tab H. The attached pages from the Sambrook manual provide an overview of northern analysis 
and other membrane-based technologies for conducting gene expression monitoring studies that 
were known and used by persons skilled in the art for many years prior to the April 23, 2001 
filing date of the Chen '746 application. 

A person skilled in the art on April 23, 2001, who read the Chen '746 
specification, would have routinely and readily appreciated that the SEQ ID NO:1-encoding 
polynucleotides disclosed therein would be useful as a probe to conduct gene expression 
monitoring analyses using northern analysis or any of the other traditional membrane-based gene 
expression monitoring techniques that were known and in common use many years prior to the 
filing of the Chen '746 application. For example, a person skilled in the art in April 2001 would 
have routinely and readily appreciated that the SEQ ID NO:1-encoding polynucleotides would be 
a useful tool in conducting gene expression analyses, using the northern analysis technique, in 
furtherance of (a) the development of drugs for the treatment of breast cancer, and (b) analyses of 
the efficacy and toxicity of such drugs. 

17. I declare further that all statements made herein of my own knowledge are 
true and that all statements made herein on information and belief are believed to be true; and 
further, that these statements were made with the knowledge that willful false statements and the 
like so made are punishable by fine or imprisonment, or both, and that willful false statements 
may jeopardize the validity of this application and any patent issuing thereon. 




Tod Bedilion 



Signed at Redwood City, California 
this^l day of July 2003 

So, how many genes are there? 

Although the sequence of the 3 billion bases (A's, C's, G's, and T's) that 
make up the human genome has been determined, the exact number of 
genes encoded by the genome is still unknown. The most recent predictions 
estimate around 30,000 genes, much lower than previous estimates of 
80,000 to 140,000. 

This lower estimate came as a shock to many scientists, because counting 
genes was viewed as a way of quantifying genetic complexity. With 
30,000 genes, the human gene count would be only one-third greater than 
that of the simple roundworm C. elegans (~20,000 genes). For scientists 
who support this lower estimate, biological complexity is explained by gene 
control and expression rather than by number. 

Studies since the publication of the draft genome sequence have generated 
estimates that differ greatly. A study conducted by scientists at Ohio State 
University suggests between 65,000 and 75,000 human genes. Another 
study published in Cell in August 2001 predicted a total human gene count 
of 42,000. 

The ongoing debate has generated a gene-count betting pool called "Gene 
Sweepstake." Each genome scientist participating in the Cold Spring Harbor 
Laboratory (CSHL) Genome Meeting is eligible to place a bet on a gene 
number. The winning number will be determined at the May 2003 CSHL 
Genome Meeting. The bets range from around 27,000 to more than 150,000. 
For details and updates on Gene Sweepstake, go to the Web site. 

It could be years before a truly reliable gene count can be assessed. The 
reason for so much discrepancy is that these predictions are derived from 
different computational methods and gene-finding algorithms. Computation 
alone is simply not enough to generate an accurate gene number. While 
gene-counting programs can identify patterns and phenomena that scientists 
have seen before, the programs are unable to recognize new phenomena. 
Clearly, gene predictions will have to be verified by labor-intensive work in 
the laboratory before the scientific community can reach any real 
consensus. 



Related Web Sites 

Build 33 - Release notes for the most current build of the human genome 
(based on sequence data available April 10, 2003) used by NCBI in its 
genome browser called Map Viewer. Predicts 26,846 genes. This is the 
same build used by the University of California, Santa Cruz (UCSC) 
Human Genome Browser. More statistics for this build also are provided. 

• Homo sapiens Genome View - Browse Build 33 of the human 
genome using NCBI's Map Viewer. Clicking on a chromosome will 
display a chromosome map that provides the total number of genes 
mapped to that chromosome. 

Ensembl Human release 13.31.1 - The most current human genome release 
available from the European Bioinformatics Institute's human genome 
browser. This release was built around NCBI's Build 31. Predicts 24,847 
genes. 

Count of Mapped Genes by Chromosome - See how many genes have been 
mapped to each chromosome. Provided by the Genome Database. 

ABSTRACT We have developed a simple procedure based 
on reassociation kinetics that can reduce effectively the high 
variation in abundance among the clones of a cDNA library 
that represent individual mRNA species. For this normaliza- 
tion, we used as a model system a library of human infant brain 
cDNAs that were cloned directionally into a phagemid vector 
and, thus, could be easily converted into single-stranded cir- 
cles. After controlled primer extension to synthesize a short 
complementary strand on each circular template, melting and 
reannealing of the partial duplexes at relatively low Cot, and 
hydroxyapatite column chromatography, unreassociated circles 
were recovered from the flow-through fraction and 
electroporated into bacteria, to propagate a normalized library 
without a requirement for subcloning steps. An evaluation of 
the extent of normalization has indicated that, from an extreme 
range of abundance of 4 orders of magnitude in the original 
library, the frequency of occurrence of any clone examined in 
the normalized library was brought within the narrow range of 
only 1 order of magnitude. 



The mRNAs of a typical somatic cell are distributed in three 
frequency classes (1, 2) that are presumably maintained in 
representative cDNA libraries. The classes at the two ex- 
tremes (ca. 10% and 40-45% of the total, respectively) 
include members occurring at vastly different relative fre- 
quencies. On average, the most prevalent class consists of 
about 10 mRNA species, each represented by 5000 copies per 
cell, whereas the class of high complexity comprises 15,000 
different species each represented by 1-15 copies only. Rare 
mRNAs are even more underrepresented in the brain, a 
tissue exhibiting an exceptionally high sequence complexity 
of transcripts (3-5). 

Although even the rarest mRNA sequence from any tissue 
is likely to be represented in a cDNA library of 10⁷ 
recombinants, its identification is very difficult (its frequency of 
occurrence may be as low as 2 × 10⁻⁶ on average or even 10⁻⁷ 
for complex tissues such as the brain). Thus, for a variety of 
purposes, it is advantageous to apply a normalization 
procedure and bring the frequency of each clone in a cDNA 
library within a narrow range (generation of a perfectly 
equimolar cDNA library is practically impossible in our 
experience). Normalized cDNA libraries can facilitate 
positional cloning projects aiming at the identification of disease 
genes, can increase the efficiency of subtractive 
hybridization procedures, and can significantly facilitate genomic 
research pursuing chromosomal assignment of expressed 
sequences and their localization in large fragments of cloned 
genomic DNA (exon mapping). Normalization makes 
feasible the gridding of cDNA libraries on filters at high density by 
reducing the number of clones to be arrayed (gridding 10⁷ 
clones for 1× coverage of a non-normalized library is not a 
feasible task). Finally, by increasing the frequency of 
occurrence of rare cDNA clones while decreasing simultaneously 
the percentage of abundant cDNAs, normalization can 
expedite significantly the development of expressed sequence 
databases by random sequencing of cDNAs. 
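
The screening burden described above can be made concrete with a small calculation. The following is only a minimal sketch, not part of the reported work: it assumes that clones are drawn independently, so that the chance of a clone of frequency f appearing at least once among N recombinants is 1 - (1 - f)^N. The function names are hypothetical.

```python
import math

def expected_copies(frequency: float, library_size: int) -> float:
    """Expected number of clones of one species in the library."""
    return frequency * library_size

def prob_at_least_one(frequency: float, library_size: int) -> float:
    """Probability that the species appears at least once (independent draws)."""
    # 1 - (1 - f)^N, evaluated with log1p/expm1 for numerical stability.
    return -math.expm1(library_size * math.log1p(-frequency))

LIBRARY_SIZE = 10**7  # recombinants, as quoted in the text

# Frequencies quoted in the text: 2e-6 on average, 1e-7 for complex tissues.
for label, freq in [("average rare clone", 2e-6), ("rare brain clone", 1e-7)]:
    print(label,
          "expected copies:", round(expected_copies(freq, LIBRARY_SIZE), 2),
          "P(at least one):", round(prob_at_least_one(freq, LIBRARY_SIZE), 3))
```

Under these assumptions the rarest brain clones are expected about once per 10⁷ recombinants, which illustrates why finding them in a non-normalized library is laborious even when they are present.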

Although cDNA library normalization could be achieved 
by saturation hybridization to genomic DNA (6), this ap- 
proach is impractical, since it would be extremely difficult to 
provide saturating amounts of the rarer cDNA species to the 
hybridization reaction. The alternative is the use of reasso- 
ciation kinetics: assuming that cDNA reannealing follows 
second-order kinetics, rarer species will anneal less rapidly 
and the remaining single-stranded fraction of cDNA will 
become progressively normalized during the course of the 
reaction (6-8). As we report here, we have used this kinetic 
principle to develop a method for normalization of a direc- 
tionally cloned cDNA library that has significant advantages 
over two previously reported similar procedures (refs. 7 and 
8; see Results and Discussion). 
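
The kinetic principle can be illustrated numerically. The sketch below is an idealization and not the procedure used in this work: it assumes that each species reanneals only with its own complement under ideal second-order kinetics, so that a species of initial concentration c0 has c0/(1 + k·c0·t) remaining single-stranded at time t. The rate constant, the concentrations, and the function name are arbitrary, illustrative choices.

```python
def remaining_single_stranded(c0: float, k: float, t: float) -> float:
    """Single-stranded concentration left at time t under ideal second-order kinetics."""
    return c0 / (1.0 + k * c0 * t)

K = 1.0              # arbitrary rate constant (illustrative units)
ABUNDANT_0 = 1000.0  # initial concentration of an abundant species
RARE_0 = 1.0         # initial concentration of a rare species

for t in (0.0, 0.001, 0.01, 0.1, 1.0):
    abundant = remaining_single_stranded(ABUNDANT_0, K, t)
    rare = remaining_single_stranded(RARE_0, K, t)
    # The abundant:rare ratio in the single-stranded fraction shrinks toward 1.
    print(f"t={t:<6} abundant/rare = {abundant / rare:.2f}")
```

In this toy case the initial 1000-fold difference between the two species shrinks steadily as annealing proceeds, which is the sense in which the single-stranded fraction becomes progressively normalized.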

MATERIALS AND METHODS 

cDNA Library Construction. Poly(A)+ RNA isolated from 
the entire brain of a female infant (72 days old), who died in 
consequence of spinal muscular atrophy, was used for 
construction of a cDNA library (IB) as described (9, 10). As a 
primer for first-strand cDNA synthesis, we used the 
oligonucleotide 5'-AACTGGAAGAATTCGCGGCCGCAGGAA(T)18-3', 
which contains a Not I site (GCGGCCGC). After 
ligation to HindIII adaptors, the cDNAs were digested with 
Not I and cloned directionally into the HindIII and Not I sites 
of a phagemid vector (L-BA) constructed by modifying 
pEMBL-9(+) (11). L-BA carries an ampicillin-resistance 
gene, plasmid and filamentous phage (f1) origins of 
replication, and cloning sites (5' HindIII-BamHI-Not I-EcoRI 3'). 
Superinfection of bacteria with the helper phage M13K07 (12) 
converts duplex plasmids into single-stranded DNA circles 
containing message-like strands of the cDNA inserts. 
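
As an aid to reading the primer sequence, the short sketch below locates restriction sites within it. It is illustrative only: the primer string is transcribed from the text with the 3' tail taken to be 18 thymidines, the recognition sequences used are the standard ones for Not I (GCGGCCGC) and EcoRI (GAATTC), and the names `PRIMER`, `SITES`, and `find_sites` are hypothetical.

```python
# First-strand primer as read from the text, assuming a (T)18 tail at the 3' end.
PRIMER = "AACTGGAAGAATTCGCGGCCGCAGGAA" + "T" * 18

# Standard recognition sequences of the two enzymes.
SITES = {"EcoRI": "GAATTC", "NotI": "GCGGCCGC"}

def find_sites(sequence: str, sites: dict) -> dict:
    """Return the 0-based position of the first occurrence of each site (-1 if absent)."""
    return {name: sequence.find(motif) for name, motif in sites.items()}

positions = find_sites(PRIMER, SITES)
print(len(PRIMER), positions)
```

On this reading the Not I site lies immediately upstream of the GGAA(T)18 stretch, consistent with directional cloning of the 3' end of each cDNA into the Not I site of the vector.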

Preparation of Single-Stranded Library DNA. Plasmid 
DNA from the IB library was electroporated into Escherichia 
coli DH5α F' bacteria, and the culture was grown under 
ampicillin selection at 37°C to an OD600 of 0.2, superinfected 
with a 20-fold excess of the helper phage M13K07, and 
harvested after 4 hr for preparation of single-stranded 
plasmids, as described (12). To eliminate contaminating 
double-stranded, replicative form (RF) DNA, 20 µg of the 
preparation was digested with Pvu II (which cleaves only duplex 
DNA molecules), extracted with phenol/chloroform, diluted 
by addition of 2 ml of loading buffer (0.12 M sodium 
phosphate buffer, pH 6.8/10 mM EDTA/1% SDS), and purified 
by hydroxyapatite (HAP) chromatography at 60°C, using a 
column preequilibrated with the same buffer (1-ml bed 
volume; 0.4 g of HAP). After a 6-ml wash with loading buffer, 
this volume was combined with the flow-through fraction, 
and the sample was extracted twice with water-saturated 
2-butanol, once with dry 2-butanol, and once with 
water-saturated ether (3 volumes per extraction). The sample was 
desalted by passage through a Nensorb column (DuPont/NEN) 
according to the manufacturer's specifications, 
concentrated by ethanol precipitation, and electrophoresed in a 
low-melting agarose gel to remove helper phage DNA and 
any residual tRNA contaminant or oligoribonucleotides 
(breakdown products from the RNase A digestion used 
during purification). The region of the gel containing the 
single-stranded library was excised and, after β-agarase 
(New England Biolabs) digestion, the DNA was purified and 
ethanol-precipitated. 

cDNA Library Normalization. The IB cDNA library was 
normalized (see Fig. 1) in two consecutive rounds to derive 
the normalized libraries ¹NIB and ²NIB, by using the 
following procedure. To synthesize a partial second strand of about 
200 nt by limited extension, 9 pmol of the oligonucleotide 
primer 5'-GGCCGCAGGAAT w -3' was added to 4.5 pmol of 
single-stranded IB library DNA in a 150-µl reaction mixture 
containing 30 mM Tris-HCl (pH 7.5); 50 mM NaCl; 15 mM 
MgCl2; 1 mM dithiothreitol; 0.1 mM dNTPs; 2.5 mM ddATP, 
ddCTP, and ddGTP; and a trace of [α-32P]dCTP. The mixture 
was incubated for 5 min at 60°C and for 15 min at 50°C, the 
temperature was lowered to 37°C, 75 units of Klenow DNA 
polymerase (United States Biochemical) was added, and the 
incubation was continued for 30 min. The reaction was 
terminated by addition of EDTA (20 mM), extracted with 
phenol/chloroform, diluted with 2 ml of HAP loading buffer 
containing 50 µg of sonicated and denatured salmon sperm 
DNA carrier, and chromatographed on HAP, as described 
above. After washing, the partial duplex circles bound to 
HAP were eluted from the column with 6 ml of 0.4 M 
phosphate buffer, pH 6.8/10 mM EDTA/1% SDS. The 
concentration of phosphate in the eluate was lowered to 0.12 
M by addition of 14 ml of water containing 50 µg of DNA 
carrier, and the chromatographic step was repeated. The final 
eluate was extracted and desalted as described above and the 
DNA was ethanol-precipitated. The pellet (112 ng) was 
dissolved in 2.5 µl of formamide and the sample was heated 
for 3 min at 80°C under a drop of mineral oil to dissociate the 
DNA strands. For an annealing reaction, the volume was 
brought to 5 µl by adding 0.5 µl of 0.1 M Tris-HCl (pH 7.5) 
containing 0.1 M EDTA, 0.5 µl of 5 M NaCl, 1 µl (5 µg) of 
(dT)25-30, and 0.5 µl (0.5 µg) of the extension primer. The last 
two ingredients were added to block stretches of adenine 
residues [representing the initial poly(A) tails] and regions 
complementary to the oligonucleotide on the single-stranded 
DNA circles. The annealing mixture was incubated at 42°C, 
and a 0.5-µl aliquot was withdrawn at 13 hr (calculated Cot, 
5.5). The unhybridized single-stranded circles (normalized 
library) were separated from the reassociated partial 
duplexes by HAP chromatography and then recovered from the 
flow-through fraction as described above. Since we, and 
others (13), have observed that the electroporation efficiency 
of partially repaired circular molecules is increased by about 
100-fold in comparison with single-stranded circles, the 
normalized cDNA circles were converted to partial duplexes by 
primer extension using random hexamers and T7 DNA 
polymerase (Sequenase version II; United States 
Biochemical), in a 10- to 20-µl reaction mixture containing 1 mM dNTPs. 
After addition of EDTA to 20 mM, phenol extraction, and 
ethanol precipitation, the cDNAs were dissolved in 10 mM 
Tris-HCl, pH 7.5/1 mM EDTA, and electroporated into 
competent bacteria (DH10B; GIBCO/BRL). To determine 
the number of transformants, 1 hr after the electroporation a 
10-µl aliquot of the culture was plated on an LB agar plate 
containing 75 µg/ml ampicillin (extrapolation from these data 
indicated that a normalized library of 2.5 × 10⁶ colonies was 
obtained). Supercoiled plasmid DNA was then prepared 
(¹NIB library) with a Qiagen plasmid kit (Qiagen, 
Chatsworth, CA). The same protocol was used for a second round 
of normalization (calculated Cot, 2.5) to derive the ²NIB 
library (1.3 × 10⁷ transformants) from a preparation of ¹NIB 
single-stranded circles, except that the HAP purification step 
after primer extension to synthesize short complementary 
strands was omitted. 

Colony Hybridization. For screening, colonies were grown 
on duplicate nylon filters (GeneScreenPlus; DuPont/NEN) 
that were processed as described (14) and hybridized at 42°C 
in 50% formamide/5× Denhardt's solution/0.75 M NaCl/ 
0.15 M Tris-HCl, pH 7.5/0.1 M sodium phosphate/0.1% 
sodium pyrophosphate/2% SDS containing sheared and 
denatured salmon sperm DNA at 100 µg/ml. Radioactive 
probes were prepared by random primed synthesis (15, 16) 
using the Prime-It II kit (Stratagene). 

DNA Sequencing. Double-stranded plasmid DNA tem- 
plates were prepared by using the Wizard Minipreps DNA 
purification system (Promega) and sequenced from both ends 
by using the universal forward and reverse M13 fluorescent 
primers. Reactions were assembled on a Biomek 1000 work- 
station (Beckman) and then transferred to a thermocycler 
(Perkin-Elmer/Cetus) for cycle sequencing. Reaction products 
were analyzed on an automated 370A DNA sequencer 
(Applied Biosystems). Nucleic acid and protein database 
searches were performed at the National Center for Biotechnology 
Information server using the BLAST algorithm (17). 

RESULTS AND DISCUSSION 

Experimental Strategy. To develop a normalization 
procedure, shown schematically in Fig. 1, and at the same time 
increase the utility of the normalized model cDNA library, 
we first constructed a high-quality brain cDNA library (IB) 
that has the following features (10): the average size of a 
cDNA insert is 1.7 kb, often providing coding-region 
information by sequencing from the 5' end; the length of the 
segment representing the mRNA poly(A) tail is short, 
allowing an increase in the output of useful sequencing information 
from the 3' end; the frequency of nonrecombinant clones is 
extremely low (0.1%); and chimeric cDNAs have not been 
encountered, after single-pass sequencing of >2000 clones 
(10, 18). However, the latter analysis also demonstrated that 
13% of the clones in the IB library lacked poly(A) tails and 
were presumably derived from aberrant priming. 

Fig. 1. Diagram of the normalization procedure. 
Single-stranded circles of a library of directionally cloned cDNAs are 
primer extended under controlled conditions to generate 
complementary strands of about 200 ± 20 nt, and the resulting 
partial duplexes are purified from unprimed circles by 
HAP chromatography. Bound DNA is melted and reannealed 
to a relatively low Cot (see text). The remaining single-stranded 
circles (normalized library) are isolated by HAP 
chromatography, converted into partial duplexes by random priming, 
and electroporated into bacteria for amplification. 

To preserve the length of the cDNAs, avoid differential 
loss of sequences, and alleviate a need for subcloning steps 
after normalization, we excluded from our protocol the use of 
PCR and chose directional cloning into a phagemid vector. 
Such vectors have been previously used advantageously for 
cDNA library subtractions (13), although normalization was 
not attempted. This cloning regime readily provides single 
strands that can be used both for annealing and for direct 
propagation in bacteria. In control experiments (data not 
shown), we assessed the frequency of occurrence of abundant 
cDNAs (encoding α- and β-tubulin, elongation factor 1α, 
and myelin basic protein) and demonstrated that, at least by 
this criterion, the representation of clones in the starting 
library remained unchanged after conversion into single- 
stranded circles. We also note that electrophoretic purifica- 
tion of the circles prior to use is necessary, to remove 
contaminating oligoribonucleotides (see Materials and Meth- 
ods), whose presence would result in undesirable internal 
priming events during the first step of our protocol. 



In contrast with our scheme, two other PCR-based nor- 
malization methods (7, 8) necessitate the use of subcloning 
steps. In one of these approaches (7), sheared cDNAs 
(0.2-0.4 kb) were ligated to a linker-primer, amplified by 
PCR, normalized kinetically, reamplified, and finally cloned 
directionally in such a way that only 3'-terminal sequences 
(almost exclusively 3' noncoding regions) were purposely 
preserved. The steps of the second scheme (8) were similar, 
except that the process started from cloned, randomly 
primed, and relatively short cDNAs, initially selected to 
minimize length-dependent differential PCR amplification. 
Thus, both coding and noncoding regions were represented in 
the final normalized library, but in pieces. 

While maintaining length and representation of mRNA 
regions, our protocol (Fig. 1) also addresses successfully the 
problem recognized in the first of the alternative approaches 
(7). It was considered that the 3' noncoding region is almost 
always unique to the transcript that it represents and is 
expected, therefore, to anneal only to its complement. In 
contrast, cross-hybridization of coding regions belonging to 
unequally represented members of oligo- or multigene fam- 
ilies could result in the elimination of rarer members from the 
population during the normalization process. This possibility 
is precluded in our method, which begins with the synthesis, 
from the 3' end of the cDNA, of a short complementary 
strand on the circular single-stranded cDNA template under 
controlled conditions, calibrated to yield strands with a 
narrow size distribution (200 ± 20 nt). Since the average 
length of 3' noncoding regions in brain mRNAs is 750 nt (19), 
the vast majority of synthesized complementary strands 
participating in the annealing reaction should be devoid of 
coding region sequences. However, after this partial 
extension step, purification of the products by HAP 
chromatography is necessary to eliminate single strands of the IB library 
lacking poly(A) tails that cannot participate in primed 
synthesis. We repeat the chromatographic step to reduce the 
background to negligible levels, since after the first passage 
through the HAP column about 0.1% of pure single strands 
bind nonspecifically. However, during the second round of 
normalization to derive the ²NIB library, we omitted this step 
since we showed that 187 clones, which were picked 
randomly and sequenced from the ¹NIB library (see below), all 
contained 3' poly(A) stretches. The remaining steps of our 
procedure entail melting and reannealing of the partial 
duplexes, followed by purification of unreassociated circles 
(normalized library) by HAP chromatography and 
electroporation into bacteria (Fig. 1). 

Fig. 2. Comparison of the frequencies of cDNA probes in the original (IB) and two normalized (¹NIB and ²NIB) libraries. The indicated 
percentages of 28 cDNA sequences in the three libraries, tabulated in order of decreasing frequency in the IB library, are shown in the form 
of a histogram to visualize normalization. Frequencies were calculated from the number of positive colonies after hybridization of duplicate filters 
containing 500-180,000 colonies from each of the three cDNA libraries with the following 28 probes: 1, elongation factor 1α; 2, α-tubulin; 3, 
β-tubulin; 4, myelin basic protein; 5, aldolase; 6, 89-kDa heat shock protein; 7, γ-actin; 8, secretogranin; 9, microtubule-associated protein; 11, 
vimentin; 13, a cDNA randomly picked from the ¹NIB library similar to a mouse cysteine-rich intestinal protein (¹NIB-2, GenBank accession 
nos. T09996 and T09997); 19, a cDNA isolated from the ¹NIB library homologous to the human endogenous retrovirus RTVLH2 (cDNA-20, 
accession nos. L13822 and L13823); 20, histone H2b.1; 23, a cDNA randomly picked from the ¹NIB library encoding the human polyposis (DP1 
gene) mRNA (¹NIB-227, accession nos. T10266 and T10267); 27, a cDNA randomly picked from the ¹NIB library related to the human 
endogenous retrovirus ERV9 gene (¹NIB-IK, accession nos. T10086 and T10087); the remaining brain cDNAs are novel, and except for nos. 
10, 18, 21, and 25, they were randomly picked from the ¹NIB library. 

Characterization of Normalized cDNA Libraries. To 
evaluate the extent of normalization achieved with our method, 
we compared the IB, ¹NIB, and ²NIB libraries by colony 
hybridization. For this analysis, we used 28 cDNA probes 
chosen to represent various frequencies of occurrence within 
a wide range (at least 4 orders of magnitude: 4.6% to 
<0.0006%) in the IB library (Fig. 2). However, an additional 
comparison of these results with independent theoretical 
estimates was necessary, to provide a further assessment of 
the degree of normalization, especially because the ¹NIB 
library was derived after incubation to a relatively low Cot 
(5.5) during the reannealing step of our procedure. When 
relatively high Cot values were used in our initial attempts to 
normalize the IB library, we obtained unsatisfactory results 
(high background) that we attribute to technical problems 
inherent to the procedure. Nevertheless, a reevaluation of 
brain cDNA hybridization data (ref. 20; see Table 1) suggests 
that a relatively low Cot would suffice for our purpose, to 
bring the frequency of each library clone within a narrow 
range. 

For our calculations (Table 1), which should be regarded as 
rough but indicative estimates, we used a set of reliable 
hybridization data that are available only for mouse brain 
mRNAs (20), assuming that these measurements should not 
differ significantly among mammals (in all cases examined, 

Table 1. Estimates of frequencies of brain mRNAs 



including humans, the average amount of RNA per brain cell 
and the number of cells per gram of tissue are practically the 
same; see, e.g., refs. 29 and 30). These calculations show that 
at Cot 5.5, of the three kinetic classes of mRNAs, the most 
abundant species are drastically diminished, while all fre- 
quencies are brought within the range of 1 order of magnitude 
(Table 1, compare columns b and h and columns f and i). Our 
experimental results (Fig. 2) show that the same range was 
achieved after a single round of normalization at this Cof 
(5.5). Thus, for all practical purposes, a single cycle is 
probably sufficient. Secondary normalization (calculated Cof 
= 2.5) to derive the 2 NIB library, although it did not result in 
a dramatic improvement, preserved the range of frequencies, 
while making the differences among individual sequences 
narrower overall (Fig. 2). Eleven of the 28 probes used in this 
analysis were derived from clones that were randomly picked 
from the *NIB library. The overall frequency fold variation 
was reduced from >7667 (4.6/<0.0006) in the IB library to 
133 (0.4/0.003) and 26 (0.1/0.01) in the *NIB and 2 NIB 
libraries, respectively. However, some unexplained anoma- 
lies were also observed for a small minority of clones, whose 
already reduced frequencies in the l NIB library were some- 
what increased in the *NIB library (Fig. 2). 

To provide a further indication that normalization was 
successful, we sequenced from both ends 187 cDNA clones 
that were randomly picked from the *NIB library (GenBank 
accession numbers T09994-T10011 and T10014-T10369). 
With the exception of 4 clones, which carried sequences 
corresponding to human mitochondrial 16S rRNA, all other 
cDNAs of this pool were unique, in agreement with the 
expectation for a normalized library. To further investigate 
the effect of the normalization procedure on the subset of 
mitochondrial 16S rRNA clones (1.4%, 1%, and 0.4% in the 
IB, X NIB and 2 NIB libraries, respectively), we compared the 
sequences of a number of 16S rRNA clones isolated from 
both the IB and *NIB libraries (kindly provided by M. Adams, 
Institute for Genomic Research and J. Sikela, University of 
Colorado). This analysis (data not shown) revealed that the 
16S rRNA clones isolated from 1 NIB did not correspond to. 



Final 

kpfo Complexity,* 1 No. of RNA Frequency per Component at frequency per 

Component 8 % b (pure) c kb species c species/ % * M 8 Cot 5.5 t h % species % 

I 16 10 96 36 044 6\15 07 O02 

II 46 0.165 5,800 2,150 0.02 0.10 44.2 0.02 
III 38 0.0079 122,000 45,000 0.0008 0.0048 55J 0.0012 

"The experimental data of pseudo-first-order hybridization kinetics of cDNA tracer, which was synthesized from mouse brain poly(A) + 
polysomal mRNA and driven by its template (20), were solved by computer (unconstrained fit) into three kinetic components, using the 
EXCESS function of a least-squares curve-fitting program (21). 

The fraction of total occupied by each of the components is shown, after a minor correction (at completion, practically all of the tracer had 
reacted). These numbers (and all other numbers) in the table have been rounded. 

^The computer-calculated pseudo-first-order hybridization rate constant (kpto', M -1 *sec _1 ) for each component was divided by each of the values 
in column b, to derive /cpf 0 (pure). 

d The complexity (i.e. , length of unique sequence) was calculated by considering the data from a calibration kinetic standard: cDNA synthesized 
from encephalomyocarditis virus RNA (complexity, 9.7 kb) that was driven by its template [*pf 0 (pure), 99]. Thus, each of the values in column 
d is the ratio (99 x 9.7)//cpf 0 (pure). The complexity calculated for the rarest component (III) matches closely the values obtained from additional 
kinetic experiments using cDNA enriched for infrequent sequences (22, 23) and also the data of saturation experiments with single-copy genomic 
DNA tracer (24, 25). 

e The number of different RNA species in each component was estimated from their complexities by assuming that the average size of brain 
mRNA is 2.7 kb (26). A conjecture (26) that rare brain mRNAs are longer than this value (hypothetically 5 kb on average) has not been supported 
by hard evidence. 

'The initial average frequency of an individual mRNA species of each component in the entire population of mRNA molecules is the ratio of 
values in column b to those in column e. 

To assess the behavior of these kinetic components under the annealing conditions that we used for normalization (Co/, 5.5; length of 
complementary sequence in annealing strands, 0.2 kb), we first calculated the second-order reassociation rate constant (*«>; M~ h sec -1 ) for each 
component. For this calculation, we considered that the k^o of a single and pure kinetic component with a complexity of 1 kb reacting at a 
fragment length of 0.2 kb is 590 (27, 28). Thus each k w value is 590 divided by the complexity in column d. 

To determine the percentage of the leftover of each component in the population at Cot 5.5, we first used the values in column g to calculate 
the fraction remaining single-stranded, according to the equation C/Co = 1/(1 + kCot) and then normalized the derived values to a total of 100%. 
'The final average frequency of an individual mRNA species of each component is the ratio of values in column h to those in column e. 
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the predominant 16S rRNA species present in the IB library. 
Interestingly, in 17 of 19 16S rRNA clones sequenced from 
the IB library, the position of the A tract was the same as that 
present in the mature 16S rRNA. In contrast, all 8 clones 
sequenced from the 1NIB library represented truncated ver- 
sions of the 16S rRNA, in which different lengths of the 3' 
terminal sequence were absent. Such truncated clones are 
underrepresented in the IB library (2 of 19). Therefore, their 
frequency was increased by normalization, as expected, 
while the 16S rRNA clones of the most prevalent form were 
reduced. It is likely that the shorter clones represent bona 
fide copies of naturally occurring truncated 16S rRNA mol- 
ecules (ref. 31-33; to be discussed elsewhere). 

Database searches (both blastn and blastx; ref. 17) 
revealed that of the 183 cDNAs examined, 152 (83%) were 
unknown (no hits), 15 (8.2%) corresponded to known human 
sequences, 5 (2.7%) were novel but related to known human 
sequences, 4 (2.2%) were homologous to mammalian se- 
quences, and 7 (3.8%) were homologous to known sequences 
from various nonmammalian organisms. 

In contrast to these results, when 1633 randomly picked 
clones from the non-normalized IB library were sequenced 
mostly (88%) from the 5' end, the percentage of unknown 
sequences was significantly lower than in our case (63%), 
while about 30% of the clones were sequenced twice or more 
(up to 50) times (10). Similar results were obtained by 
sequencing 493 random IB clones exclusively from the 3' end 
(18). Of the initially abundant cDNAs, which were sequenced 
multiple times in both of these studies, those encoding 
elongation factor 1α, α-tubulin, β-tubulin, myelin basic protein, 
and γ-actin (corresponding to our probes 1-4 and 7; Fig. 
2) were absent from the pool of 187 clones that we examined. 
Moreover, only 15 of the unique 183 clones that we sequenced 
from the 1NIB library (8%) had been previously 
identified in the collection of the sequenced 1633 IB clones. 

Eighteen of the unknown cDNAs that we sequenced (10% 
of the total clones) carried Alu repetitive elements (6 at the 5' 
end; 11 at the 3' end; and 1 at both ends). Thus, as previously 
observed (8), the frequency of cDNAs containing Alu repeats 
is not reduced by normalization. This phenomenon can be 
attributed to sequence heterogeneity among Alu family mem- 
bers, which are able to form imperfect hybrids that probably 
cannot bind to HAP. However, this is not a disadvantageous 
property, since it prevents elimination of rare Alu-carrying 
cDNAs from the population. 

To assess whether the normalization procedure had 
skewed the distribution of lengths favoring shorter cDNA 
clones, Southern blots of released inserts from the IB, 1NIB, 
and 2NIB plasmids were hybridized with several of the cDNA 
probes used in Fig. 2 individually. The results (not shown) 
demonstrated that the intensity of hybridization signals var- 
ied as expected, but the size of each hybridizing fragment 
remained the same. 

Note. Sasaki et al. (34) have described an alternative normalization 
procedure, in which a cDNA library was constructed following 
depletion of abundant mRNA species by sequential cycles of hy- 
bridization to matrix-bound cDNA. However, this procedure does 
not seem to be more advantageous than ours, while its actual 
practical potential remains to be assessed, as the putative normalized 
library was not adequately characterized. 
ABSTRACT The recent ability to sequence whole genomes 
allows ready access to all genetic material. The approaches 
outlined here allow automated analysis of sequence for the 
synthesis of optimal primers in an automated multiplex 
oligonucleotide synthesizer (AMOS). The efficiency is such 
that all ORFs for an organism can be amplified by PCR. The 
resulting amplicons can be used directly in the construction of 
DNA arrays or can be cloned for a large variety of functional 
analyses. These tools allow a replacement of single-gene 
analysis with a highly efficient whole-genome analysis. 



The genome sequencing projects have generated and will 
continue to generate enormous amounts of sequence data. The 
genomes of Saccharomyces cerevisiae, Escherichia coli, Hae- 
mophilus influenzae (1), Mycoplasma genitalium (2), and Meth- 
anococcus jannaschii (3) have been completely sequenced. 
Other model organisms have had substantial portions of their 
genomes sequenced as well, including the nematode Caeno- 
rhabditis elegans (4) and the small flowering plant Arabidopsis 
thaliana (5). This massive and increasing amount of sequence 
information allows the development of novel experimental 
approaches to identify gene function. 

One standard use of genome sequence data is to attempt to 
identify the functions of predicted open reading frames 
(ORFs) within the genome by comparison to genes of known 
function. Such a comparative analysis of all ORFs to existing 
sequence data is fast, simple, and requires no experimentation 
and is therefore a reasonable first step. While finding sequence 
homologies/motifs is not a substitute for experimentation, 
noting the presence of sequence homology and/or sequence 
motifs can be a useful first step in finding interesting genes, in 
designing experiments and, in some cases, predicting function. 
However, this type of analysis is frequently uninformative. For 
example, over one-half of new ORFs in S. cerevisiae have no 
known function (6). If this is the case in a well studied organism 
such as yeast, the problem will be even worse in organisms that 
are less well studied or less manipulable. A large, experimen- 
tally determined gene function database would make homol- 
ogy/motif searches much more useful. 

Experimental analysis must be performed to thoroughly 
understand the biological function of a gene product. Scaling 
up from classical "cottage industry" one-gene-oriented ap- 
proaches to whole-genome analysis would be very expensive 
and laborious. It is clear that novel strategies are necessary to 
efficiently pursue the next phase of the genome projects — 
whole-genome experimental analysis to explore gene expres- 
sion, gene product function, and other genome functions. 
Model organisms, such as S. cerevisiae, will be extremely 
important in the development of novel whole-genome analysis 
techniques and, subsequently, in improving our understanding 
of other more complex and less manipulable organisms. 

The genome sequence can be systematically used as a tool 
to understand ORFs, gene product function, and other ge- 
nome regions. Toward this end, a directed strategy has been 
developed for exploiting sequence information as a means of 
providing information about biological function (Fig. 1). Ef- 
forts have been directed toward the amplification of each 
predicted ORF or any other region of the genome ranging 
from a few base pairs to several kilobase pairs. There are many 
uses for these amplicons — they can be cloned into standard 
vectors or specialized expression vectors, or can be cloned into 
other specialized vectors such as those used for two-hybrid 
analysis. The amplicons can also be used directly by, for 
example, arraying onto glass for expression analysis, for DNA 
binding assays, or for any direct DNA assay (7). As a pilot 
study, synthetic primers were made on the 96-well automated 
multiplex oligonucleotide synthesizer (AMOS) instrument (8) 
(Fig. 2). These oligonucleotides were used to amplify each 
ORF on yeast chromosome V. The current version of this 
instrument can synthesize three plates of 96 oligonucleotides 
each (25 bases) in an 8-hr day. The amplification of the entire 
set of PCR products was then analyzed by gel electrophoresis 
(Fig. 3). Successful amplification of the proper length product 
on the first attempt was 95%. This project demonstrates that 
one can go directly from sequence information to biological 
analysis in a truly automated, totally directed manner. 
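
The throughput figures quoted above lend themselves to a quick back-of-the-envelope estimate. The sketch below is illustrative only: it assumes one forward and one reverse primer per ORF and treats the 95% first-attempt figure as a per-ORF success rate; the function names are hypothetical.

```python
import math

# Figures taken from the text: three 96-well plates per 8-hr day.
OLIGOS_PER_PLATE = 96
PLATES_PER_DAY = 3
PRIMERS_PER_ORF = 2  # assumption: one forward and one reverse primer per ORF


def synthesis_days(n_orfs):
    """Working days needed to synthesize both primers for n_orfs ORFs."""
    oligos = n_orfs * PRIMERS_PER_ORF
    return math.ceil(oligos / (OLIGOS_PER_PLATE * PLATES_PER_DAY))


def expected_failures(n_orfs, first_pass_success=0.95):
    """ORFs expected to need a second amplification attempt."""
    return round(n_orfs * (1.0 - first_pass_success))
```

For a genome of roughly 6,000 ORFs, `synthesis_days(6000)` gives 42 working days on a single instrument, and `expected_failures(6000)` gives about 300 ORFs to batch for a repeat attempt.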

These amplicons can be incorporated directly in arrays or 
the amplicons can be cloned. If the amplicons are to be cloned, 
novel sequences can be incorporated at the 5' end of the 
oligonucleotide to facilitate cloning. One potential problem 
with cloning PCR products is that the cloned amplicons may 
contain sequence alterations that diminish their utility. One 
option would be to resequence each individual amplicon. 
However, this is expensive, inefficient, and time consuming. A 
faster, more cost-effective, and more accurate approach is to 
apply comparative sequencing by denaturing HPLC (9). This 
method is capable of detecting a single base change in a 2-kb 
heteroduplex. Longer amplicons can be analyzed by use of 
appropriate restriction fragments. If any change is detected in 
a clone, an alternate clone of the same region can be analyzed. 
Modifying the system to allow high throughput analysis by 
denaturing HPLC is also relatively simple and straightforward. 

If amplicons are used directly on arrays without cloning, it 
is important to note that, even if single PCR product bands are 
observed on gels, the PCR products will be contaminated with 
various amounts of other sequences. This contamination has 
the potential to affect the results in, for example, expression 
Fig. 1. Overview of systematic method for isolating individual 
genes. Sequence information is obtained automatically from sequence 
databases. The data are input into primer selection software specifi- 
cally designed to target ORFs as designated by database annotations. 
The output file containing the primer information is directly read by 
a high-throughput oligonucleotide synthesizer, which makes the oli- 
gonucleotides in 96-well plates (AMOS, automated multiplex oligo- 
nucleotide synthesizer). The forward and reverse primers are synthe- 
sized in the same location on separate plates to facilitate the down- 
stream handling of primers. The amplicons are generated by PCR in 
96-well plates as well. 

analysis. On the other hand, direct use of the amplicons is 
much less labor intensive and greatly decreases the occurrence 
of mistakes in clone identification, a ubiquitous problem 
associated with large clone set archiving and retrieving. 

Any large-scale effort to capture each ORF within a genome 
must rely on automation if cost is to be minimized while 
efficiency is maximized. Toward that end, primers targeting 
ORFs were designed automatically using simple new scripts 
and existing primer selection software. These script-selected 
primer sequences were directly read by the high-throughput 
synthesizer and the forward and reverse primers were synthe- 
sized in separate plates in corresponding wells to facilitate 
automated pipetting and PCR amplifications. Each of the 
resulting PCR products, generated with minimum labor, con- 
tains a known, unique ORF. 
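
The plate bookkeeping described here, with forward and reverse primers placed in corresponding wells of separate plates, might look something like the following. This is a hypothetical sketch, not the scripts used in the study: the plate labels (`F1`, `R1`), the row-major fill order, and the function name are all assumptions made for illustration.

```python
import string

# Standard 96-well layout: rows A-H, columns 1-12, filled row by row (assumed order).
ROWS = string.ascii_uppercase[:8]
COLS = range(1, 13)
WELLS = [f"{row}{col}" for row in ROWS for col in COLS]


def assign_wells(orf_names):
    """Give each ORF the same well position on a forward and a reverse plate."""
    layout = []
    for index, orf in enumerate(orf_names):
        plate, position = divmod(index, len(WELLS))
        layout.append({
            "orf": orf,
            "forward_plate": f"F{plate + 1}",  # hypothetical plate labels
            "reverse_plate": f"R{plate + 1}",
            "well": WELLS[position],
        })
    return layout
```

Because the well is shared between the two plates, a multichannel pipettor can combine each primer pair for PCR without any per-well lookup.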

Large-scale genome analysis projects are dependent on 
newly emerging technologies to make the studies practical and 
economically feasible. For example, the cost of the primers, a 
significant issue in the past, has been reduced dramatically to 
make feasible this and other projects that require tens of 
thousands of oligonucleotides. Other methods of high- 
throughput analysis are also vital to the success of functional 
analysis projects, such as microarraying and oligonucleotide 
chip methods (10-14). 

Changes in attitude are also required. One of the major costs 
of commercial oligonucleotides is extensive quality control 
such that virtually 100% of the supplied oligonucleotides are 
successfully synthesized and work for their intended purpose. 
Fig. 2. Overall approach for using database of a genome to direct 
biological analysis. The synthesis of the 6,000 ORFs (orfs) for each 
gene of S. cerevisiae can be used in many applications utilizing both 
cloning and microarraying technology. 

Considerable cost reduction can be obtained by simply de- 
creasing the expected successful synthesis rate to 95-97%. One 
can then achieve faster and cheaper whole genome coverage by 
simply adding a single quality control at the end of the 
experiment and batching the failures for resynthesis. 

The directed nature of the amplicon approach is of clear 
advantage. The sequence of each ORF is analyzed automati- 
cally, and unique specific primers are made to target each 
ORF. Thus, there is relatively little time or labor involved— for 
example, no random cloning and subsequent screening is 
required because each product is known. In the test system, 
primers for 240 ORFs from chromosome V were systematically 
synthesized, beginning from the left arm and continuing 
through to the right arm. At no point was there any manual 
analysis of sequence information to generate the collection. In 
many ways, now that the sequence is known, there is no need 
for the researcher to examine it. 

These amplicons can be arrayed and expression analysis can 
be done on all arrayed ORFs with a single hybridization (10). 
Those ORFs that display significant differential expression 
patterns under a given selection are easily identified without 
the laborious task of searching for and then sequencing a clone. 
Once scaled up, the procedure provides even greater returns 
on effort, because a single hybridization will ultimately provide 
a "snapshot" of the expression of all genes in the yeast genome. 
Thus, the limiting factor in whole genome analysis will not be 
the analysis process itself, but will instead be the ability of 
researchers to design and carry out experimental selections. 

Current expression and genetic analysis technologies are 
geared toward the analysis of single genes and are ill suited to 
analyze numerous genes under many conditions. Additional 
difficulties with current technologies include: the effort and 
expense required to analyze expression and make mutants, the 
potential duplication of effort if done by different laboratories, 
and the possibility of conflicting results obtained from differ- 
ent laboratories. In contrast, whole genome analysis not only 
is more efficient, it also provides data of much higher quality; 
all genes are assayed and compared in parallel under exactly 
Fig. 3. Gel image of amplifications. Using the method described in Fig. 1, amplicons were generated for ORFs of S. cerevisiae chromosome 
V. One plate of 96 amplification reactions is shown: 



the same conditions. In addition, amplicons have many appli- 
cations beyond gene expression. For example, one recent 
approach is to incorporate a unique DNA sequence tag, 
synthesized as part of each gene specific primer, during 
amplification. The tags or molecular bar codes, when reintro- 
duced into the organism as a gene deletion or as a gene clone, 
can be used much more efficiently than individual mutations 
or clones because pools of tagged mutants or transformants 
can be analyzed in parallel. This parallel analysis is possible 
because the tags are readily and quantitatively amplified even 
in complex mixtures of tags (13). 

These ORF genome arrays and oligonucleotide tagged 
libraries can be used for many applications. Any conventional 
selection applied to a library that gives discrete or multiple 
products can use these technologies for a simple direct read- 
out. These include screens and selections for mutant comple- 
mentation, overexpression suppression (15, 16), second-site 
suppressors, synthetic lethality, drug target overexpression 
(17), two-hybrid screens (18), genome mismatch scanning (19), 
or recombination mapping. 

The genome projects have provided researchers with a vast 
amount of information. These data must be used efficiently 
and systematically to gain a truly comprehensive understand- 
ing of gene function and, more broadly, of the entire genome 
which can then be applied to other organisms. Such global 
approaches are essential if we are to gain an understanding of 
the living cell. This understanding should come from the 
viewpoint of the integration of complex regulatory networks, 
the individual roles and interactions of thousands of functional 
gene products, and the effect of environmental changes on 
both gene regulatory networks and the roles of all gene 
products. The time has come to switch from the analysis of a 
single gene to the analysis of the whole genome. 

Support was provided by National Institutes of Health Grants 
R37H60198 and P01H600205. 
DNA array technology makes it possible to rapidly genotype individuals or quantify the expression 
of thousands of genes on a single filter or glass slide, and holds enormous potential in toxicologic 
applications. This potential led to a U.S. Environmental Protection Agency-sponsored workshop 
titled "Application of Microarrays to Toxicology" on 7-8 January 1999 in Research Triangle Park, 
North Carolina. In addition to providing state-of-the-art information on the application of DNA or 
gene microarrays, the workshop catalyzed the formation of several collaborations, committees, and 
user's groups throughout the Research Triangle Park area and beyond. Potential application of 
microarrays to toxicologic research and risk assessment include genome-wide expression analyses to 
identify gene-expression networks and toxicant-specific signatures that can be used to define mode 
of action, for exposure assessment, and for environmental monitoring. Arrays may also prove useful 
for monitoring genetic variability and its relationship to toxicant susceptibility in human popula- 
tions. Key words: DNA arrays, gene arrays, microarrays, toxicology. Environ Health Perspect 
107:681-685 (1999). [Online 6 July 1999] 
Decoding the genetic blueprint is a dream that 
offers manifold returns in terms of understanding 
how organisms develop and function in an 
often hostile environment. With the rapid 
advances in molecular biology over the last 30 
years, the dream has come a step closer to reality. 
Molecular biologists now have the ability to 
elucidate the composition of any genome. 
Indeed, almost 20 genomes have already been 
sequenced and more than 60 are currently 
under way. Foremost among these is the 
Human Genome Mapping Project. However, 
the genomes of a number of commonly used 
laboratory species are also under intensive 
investigation, including yeast, Arabidopsis, 
maize, rice, zebra fish, mouse, rat, and dog. It 
is widely expected that the completion of such 
programs will facilitate the development of 
many powerful new techniques and approaches 
to diagnosing and treating genetically and 
environmentally induced diseases which afflict 
mankind. However, the vast amount of data 
being generated by genome mapping will 
require new high-throughput technologies to 
investigate the function of the millions of new 
genes that are being reported. Among the most 
widely heralded of the new functional 
genomic technologies are DNA arrays, which 
represent perhaps the most anticipated new 
molecular biology technique since polymerase 
chain reaction (PCR). 

Arrays enable the study of literally thousands 
of genes in a single experiment. The 
potential importance of arrays is enormous and 
has been highlighted by the recent publication 
of an entire Nature Genetics supplement dedicated 
to the technology (1). Despite this huge 
surge of interest, DNA arrays are still little used 
and largely unproven, as demonstrated by the 
high ratio of review and press articles to actual 
data papers. Even so, the potential they offer 
has driven venture capitalists into a frenzy of 
investment and many new companies are 
springing up to claim a share of this rapidly 
developing market. 

The U.S. Environmental Protection 
Agency (EPA) is interested in applying DNA 
array technology to ongoing toxicologic studies. 
To learn more about the current state of 
the technology, the Reproductive Toxicology 
Division (RTD) of the National Health and 
Environmental Effects Research Laboratory 
(NHEERL; Research Triangle Park, NC) 
hosted a workshop on "Application of 
Microarrays to Toxicology" on 7-8 January 
1999 in Research Triangle Park, North 
Carolina. The workshop was organized by 
David Dix, Robert Kavlock, and John Rockett 
of the RTD/NHEERL. Twenty-two intramural 
and extramural scientists from government, 
academia, and industry shared information, 
data, and opinions on the current and 
future applications for this exciting new technology. 
The workshop had more than 150 
attendees, including researchers, students, and 
administrators from the EPA, the National 
Institute of Environmental Health Sciences 
(NIEHS), and a number of other establishments 
from Research Triangle Park and 
beyond. Presentations ranged from the technology 
behind array production through the 
sharing of actual experimental data and projections 
on the future importance and applications 
of arrays. The information contained in 
the workshop presentations should provide aid 
and insight into arrays in general and their 
application to toxicology in particular. 

Array Elements 

In the context of molecular biology, the word 
"array" is normally used to refer to a series of 
DNA or protein elements firmly attached in 
a regular pattern to some kind of supportive 
medium. DNA array is often used interchangeably 
with gene array or microarray. 
Although not formally defined, microarray is 
generally used to describe the higher density 
arrays typically printed on glass chips. The 
DNA elements that make up DNA arrays 
can be oligonucleotides, partial gene 
sequences, or full-length cDNAs. Companies 
offering pre-made arrays that contain less 
than full-length clones normally use regions 
of the genes which are specific to that gene to 
prevent false positives arising through cross-hybridization. 
Sequence verification of 
cDNA clone identity is necessary because of 
errors in identifying specific clones from 
cDNA libraries and databases. Premade 
DNA arrays printed on membranes are currently 
or imminently available for human, 
mouse, and rat. In most cases they contain 
DNA sequences representing several thousand 
different sequence clusters or genes as 
delineated through the National Center for 
Biotechnology Information UniGene Project 
(2). Many of these different UniGene clusters 
(putative genes) are represented only by 
expressed sequence tags (ESTs). 

Array Printing 

Arrays are typically printed on one of two 
types of support matrix. Nylon membranes 
are used by most off-the-shelf array providers 
such as Clontech Laboratories, Inc. 
(Palo Alto, CA), Genome Systems, Inc. (St. 
Louis, MO), and Research Genetics, Inc. 
(Huntsville, AL). Microarrays such as those 
produced by Affymetrix, Inc. (Santa Clara, 
CA), Incyte Pharmaceuticals, Inc. (Palo Alto, 
CA), and many do-it-yourself (DIY) arraying 
groups use glass wafers or slides. Although 
standard microscope slides may be used, they 
must be preprepared to facilitate sticking 
of the DNA to the glass. Several different 
coatings have been successfully used, including 
silane and lysine. The coating of slides 
can easily be carried out in the laboratory, 
but many prefer the convenience of precoated 
slides available from suppliers. 

Once the support matrix has been prepared, 
the DNA elements can be applied by 
several methods. Affymetrix, Inc., has developed 
a unique photolithographic technology 
for attaching oligonucleotides to glass wafers. 
More commonly, DNA is applied by either 
noncontact or contact printing. Noncontact 
printers can use thermal, solenoid, or piezoelectric 
technology to spray aliquots of solution 
onto the support matrix and may be used to 
produce slide or membrane-based arrays. 
Cartesian Technologies, Inc. (Irvine, CA) has 
developed nQUAD technology for use in its 
PixSys printers. The system couples a syringe 
pump with the microsolenoid valve, a combination 
that provides rapid quantitative dispensing 
of nanoliter volumes (down to 42 nL) over 
a variable volume range. A different approach 
to noncontact printing uses a solid pin and ring 
combination (Genetic MicroSystems, Inc., 
Woburn, MA). This system (Figure 1) allows a 
broader range of sample, including cell suspensions 
and particulates, because the printing 
head cannot be blocked up in the same way as 
a spray nozzle. Fluid transfer is controlled in 
this system primarily by the pin dimensions 
and the force of deposition, although the 
nature of the support matrix and the sample 
will also affect transfer to some degree. 

In contact printing, the pin head is dipped 
in the sample and then touched to the support 
matrix to deposit a small aliquot. Split pins 
were one of the first contact-printing devices 
to be reported and are the suggested format 
for DIY arrayers, as described by Brown (3). 
Split pins are small metal pins with a precise 
groove cut vertically in the middle of the pin 
tip. In this system, 1-48 split pins are positioned 
in the pin-head. The split pins work by 
simple capillary action, not unlike a fountain 
pen: when the pin heads are dipped in the 
sample, liquid is drawn into the pin groove. A 
small (fixed) volume is then deposited each 
time the split pins are gently touched to 
the support matrix. Sample (100-500 pL 
depending on a variety of parameters) can be 
deposited on multiple slides before refilling is 
required, and array densities of > 2,500 
spots/cm^2 may be produced. The deposit volume 
depends on the split size, sample fluidity, 
and the speed of printing. Split pins are 
relatively simple to produce and can be made 
in-house if a suitable machine shop is available. 
Alternatively, they can be obtained 
directly from companies such as TeleChem 
International, Inc. (Sunnyvale, CA). 

Irrespective of their source, printers 
should be run through a preprint sequence 
prior to producing the actual experimental 
arrays: the first 100 or so spots of a new run 
tend to be somewhat variable. Factors affecting 
spot reproducibility include slide treatment 
homogeneity, sample differences, and 
instrument errors. Other factors that come 
into play include clean ejection of the drop 
and clogging (nQUAD printing) and 
mechanical variations and long-term alteration 
in print-head surface of solid and split 
pins. However, with careful preparation it is 
possible to get a coefficient of variance for 
spot reproducibility below 10%. 
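
As an illustration of how the 10% figure might be checked in practice, the sketch below computes the coefficient of variation (standard deviation as a percentage of the mean) for a set of replicate spot intensities. The function names and the choice of the sample standard deviation are assumptions; the workshop report does not specify how the value was calculated.

```python
from statistics import mean, stdev


def coefficient_of_variation(intensities):
    """Sample standard deviation as a percentage of the mean."""
    return 100.0 * stdev(intensities) / mean(intensities)


def passes_reproducibility(intensities, threshold_percent=10.0):
    """True when replicate spots vary by less than the threshold (10% in the text)."""
    return coefficient_of_variation(intensities) < threshold_percent
```

In use, the intensities would come from replicate spots of the same sample printed after the preprint sequence, so that the variable first spots of a run do not inflate the estimate.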

One potential printing problem is sample 
carryover. Repeated washing, blotting, and 
drying (vacuum) of print pins between samples 
is normally effective at reducing sample carry- 
over to negligible amounts. Printing should 
also be carried out in a controlled environ- 
ment. Humidified chambers are available in 
which to place printers. These help prevent 
dust contamination and produce a uniform 
drying rate, which is important in determining 
spot size, quality, and reproducibility. 

In summary, although several printing 
technologies are available, none are par- 
ticularly outstanding and the bottom line 
is that they are still in a relatively early stage 
of evolution. 

Array Hybridization 

The hybridization protocol is, practically 
speaking, relatively straightforward and those 
with previous experience in blotting should 
have little difficulty. Array hybridizations 
are, in essence, reverse Southern/Northern 
blots: instead of applying a labeled probe to 
the target population of DNA/RNA, the 
labeled population is applied to the probe(s). 
With membrane-based arrays, the control and 
treated mRNA populations are normally converted 
to cDNA and labeled with isotope (e.g., 
3 *P) in the process. These labeled populations 
are then hybridized independently to parallel 
or serial arrays and the hybridization signal is 
detected with a phosphorimager. A less commonly 
used alternative to radioactive probes is 
enzymatic detection. The probe may be 
biotinylated, haptenylated, or have alkaline 
phosphatase/horseradish peroxidase attached. 
Hybridization is detected by enzymatic reaction 
yielding a color reaction (4). Differences 
in hybridization signals can be detected by eye 
or, more accurately, with the help of digital 
imaging and commercially available software. 
The labeling of the test populations for slide-based 
microarrays uses a slightly different 
approach. The probe typically consists of two 
samples of polyA+ RNA (usually from a treated 
and a control population) that are converted to 
cDNA; in the process each is labeled with a 
different fluor. The independently labeled 
probes are then mixed together and hybridized 
to a single microarray slide and the resulting 
combined fluorescent signal is scanned. After 




Figure 1. Genetic MicroSystems (Woburn, MA) pin 
ring system for printing arrays. The pin ring combination 
consists of a circular open ring oriented 
parallel to the sample solution, with a vertical pin 
centered over the ring. When the ring is dipped 
into a solution and lifted, it withdraws an aliquot 
of sample held by surface tension. To spot the 
sample, the pin is driven down through the ring 
and a portion of the solution is transferred to the 
bottom of the pin. The pin continues to move 
downward until the pendant drop of solution 
makes contact with the underlying surface. The 
pin is then lifted, and gravity and surface tension 
cause deposition of the spot onto the array. 
Figure from Flowers et al. (14), with permission 
from Genetic MicroSystems. 

normalization, it is possible to determine the 
ratio of fluorescent signals from a single 
hybridization of a slide-based microarray. 
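
A minimal sketch of the ratio calculation follows. The text does not say which normalization was used, so the example assumes a simple global scaling that equalizes the total signal of the two channels; the function name and the use of log2 ratios are likewise illustrative choices rather than the workshop's stated method.

```python
import math


def normalized_log_ratios(control, treated):
    """Per-spot log2(treated/control) after global normalization.

    Assumption: the treated channel is rescaled so that its summed
    intensity equals that of the control channel.
    """
    scale = sum(control) / sum(treated)
    return [math.log2((t * scale) / c) for c, t in zip(control, treated)]
```

With this scaling, a slide on which every treated-channel spot is uniformly twice as bright as the control returns a ratio of zero for every spot, so only gene-specific differences remain.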

cDNA derived from control and treated 
populations of RNA is most commonly 
hybridized to arrays, although subtractive 
hybridization or differential display reactions 
may also be used. Fluorophore- or radiolabeled 
nucleotides are directly incorporated 
into the cDNA in the process of converting 
RNA to cDNA. Alternatively, 5' end-labeled 
primers may be used for cDNA synthesis. 
These are labeled with a fluorophore for 
direct visualization of the hybridized array. 
Alternatively, biotin or a hapten may be 
attached to the primer, in which case fluor-labeled 
streptavidin or antibody must be 
applied before a signal can be generated. The 
most commonly used fluorophores at present 
are cyanine (Cy)3 and Cy5 (Amersham 
Pharmacia Biotech AB, Uppsala, Sweden). 
However, the relative expense of these fluorescent 
conjugates has driven a search for 
cheaper alternatives. Fluorescein, rhodamine, 
and Texas red have all been used, and 
companies such as Molecular Probes, Inc. 
(Eugene, OR) are developing a series of 
labeled nucleotides with a wide range of excitation 
and emission spectra which may prove 
to function as well as the Cy dyes. 



Table 1. Advantages and disadvantages of different microarray scanning systems. 

Analysis of DNA Microarrays 

Membrane-based arrays are normally analyzed 
on film or with a phosphorimager, whereas 
chip-based arrays require more specialized scanning 
devices. These can be divided into three 
main groups: the charge-coupled device camera 
systems, the nonconfocal laser scanners, and the 
confocal laser scanners. The advantages and disadvantages 
of each system are listed in Table 1. 

Because a typical spot on a microarray can 
contain > 10^8 molecules, it is clear that a large 
variation in signal strength may occur. 
Current scanners cannot work across this 
many orders of magnitude (4 or 5 is more typical). 
However, the scanning parameters can 
normally be adjusted to collect more or less 
signal, such that two or three scans of the same 
array should permit the detection of rare and 
abundant genes. 

When a microanay is scanned, die fluores- 
cent images are captured by software normally 
included with the scanner. Several commercial 
suppliers provide additional software for quan- 
tifying array images, but the software tools are 
constantly evolving to meet the developing 
needs of researchers, and it is prudent to 
define one's own needs and clarify the exact 
capabilities of the software before its purchase. 
Issues that should be considered include the 
following: 

• Can the software locate offset spots? 

• Can it quantitate across irregular hybridiza- 
tion signals? 

• Can the arrayed genes be programmed in for 
easy identification and location? 

• Can the software connect via the Internet to 
databases containing further information on 
the gene(s) of interest? 

One of the key issues raised at the workshop
was the sensitivity of microarray technology.
Experiments by General Scanning, Inc.
(Watertown, MA), have shown that by using
the Cy dyes and their scanner, signal can be
detected down to levels of <1 fluor molecule
per square micrometer, which translates to
detecting a rare message at approximately one
copy per cell or less.

Array Applications 

Although arrays are an emerging technology
certain to undergo improvement and
alteration, they have already been applied
usefully to a number of model systems. Arrays are
at their most powerful when they contain the
entire genome of the species they are being
used to study. For this reason, they have strong
support among researchers utilizing yeast and
Caenorhabditis elegans (5). The genomes of
both of these species have been sequenced and,
in the case of yeast, deposited onto arrays for
examination of gene expression (6,7). With
both of these species, it is relatively easy to
perturb individual gene expression. Indeed,
C. elegans knockouts can be made simply by
soaking the worms in an antisense solution of
the gene to be knocked out.

By a process of systematic gene disruption,
it is now possible to examine the cause
and effect relationships between different
genes in these simple organisms. This kind of
approach should help elucidate biochemical
pathways and genetic control processes,
deconvolute polygenic interactions, and
define the architecture of the cellular network.
A simple case study of how this can be
achieved was presented by Butow [University
of Texas Southwestern Medical Center,
Dallas, TX (Figure 2)]. Although it is the
phenotypic result of a single gene knockout
that is being examined, the effect of such
perturbation will almost always be polygenic.
Polygenic interactions will become increasingly
important as researchers begin to move
away from single gene systems when examining
the nature of toxicologic responses to
external stimuli. This is especially important
in toxicology because the phenotype produced
by a given environmental insult is
never the result of the action of a single gene;
rather, it is a complex interaction of one or
multiple cellular pathways. Phenomena such
as quantitative trait (the continuous variation
of phenotype), epistasis (the effect of alleles of
one or more genes on the expression of other
genes), and penetrance (proportion of individuals
of a given genotype that display a particular
phenotype) will become increasingly
evident and important as toxicologists push
toward the ultimate goal of matching the
responses of individuals to different
environmental stimuli.

Analysis of the transcriptome (the expression
level of all the genes in a given cell population)
was a use of arrays addressed by several
speakers. Unfortunately, current gene nomenclature
is often confusing in that single genes
are allocated multiple names (usually as a result
of independent discovery by different laboratories),
and there was a call for standardization of
gene nomenclature. Nevertheless, once a
transcriptome has been assembled it can then be
transferred onto arrays and used to screen any
chosen system. The EPA MicroArray
Consortium (EPAMAC) is assembling testes
transcriptomes for human, rat, and mouse. In a
slightly different approach, Nuwaysir et al. (8)
describe how the NIEHS assembled what is
effectively a "toxicological transcriptome," a
library of human and mouse genes that have
previously been proven or implicated in
responses to toxicologic insults. Clontech
Laboratories, Inc. (Palo Alto, CA) has begun a
similar process by developing stress/toxicology
filter arrays of rat, mouse, and human genes.
Thus, rather than being tissue or cell specific,
these stress/toxicology arrays can be used across
a variety of model systems to look for
alterations in the expression of toxicologically
important genes and define the new field of
toxicogenomics. The potential to identify
toxicant families based on tissue- or cell-specific
gene expression could revolutionize drug
testing. These molecular signatures or fingerprints
could not only point to the possible
toxicity/carcinogenicity of newly discovered
compounds (Figure 3), but also aid in elucidating
their mechanism of action through identification
of gene expression networks. By extension,
such signatures could provide easily
identifiable biomarkers to assess the degree, time,
and nature of exposure.
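
The idea of matching the molecular signature of a new compound against those of known toxicant families can be illustrated with a short sketch. This is only one plausible way to do the comparison, not a method described at the workshop: it assumes each signature is a list of log2(treated/control) ratios for the same ordered set of genes and uses Pearson correlation as the similarity measure. The family names and all numbers are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if norm_x == 0 or norm_y == 0:
        return 0.0  # a flat profile carries no pattern to match
    return cov / (norm_x * norm_y)


def closest_signature(profile, signatures):
    """Score a profile against every reference signature; return the best match."""
    scores = {name: pearson(profile, ref) for name, ref in signatures.items()}
    best = max(scores, key=scores.get)
    return best, scores


# Hypothetical log2(treated/control) ratios for the same five genes.
signatures = {
    "family_A": [2.0, 1.5, 0.0, -1.0, 0.1],
    "family_B": [-1.0, 0.0, 2.0, 1.5, -0.5],
}
unknown = [1.8, 1.2, 0.2, -0.8, 0.0]

best, scores = closest_signature(unknown, signatures)
print(best, {name: round(score, 2) for name, score in scores.items()})
```

A high correlation with one family would only flag the compound for closer toxicological study; it would not by itself establish a shared mechanism.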

DNA arrays are primarily a tool for examining
differential gene expression in a given
model. In this context they are referred to as
closed systems because they lack the ability of
other differential expression technologies, e.g.,
differential display and subtractive hybridization,
to detect previously unknown genes not
present on the array. This would appear to
limit the power of DNA arrays to the imaginations
and preconceptions of the researcher in
selecting genes previously characterized and
thought to be involved in the model system.
However, the various genome sequencing projects
have created a new category of
sequence, the EST, that has partially mollified
this deficiency. ESTs are cDNAs expressed
in a given tissue that, although they may share
some degree of sequence similarity to previously
characterized genes, have not been assigned
specific genetic identity. By incorporating EST
clones into an array, it is possible to monitor
the expression of these unknown genes. This
can enable the identification of previously
uncharacterized genes that may have biologic
significance in the model system. Filter arrays
from Research Genetics and slide arrays from
Incyte Pharmaceuticals both incorporate large
numbers of ESTs from a variety of species.

A further use of microarrays is the identification
of single nucleotide polymorphisms
(SNPs). These genomic variations are abundant
(they occur approximately every 1 kb or
so) and are the basis of restriction fragment
length polymorphism analysis used in forensic
analysis. Affymetrix, Inc. designed chips that
contain multiple repeats of the same gene
sequence. Each position is present with all four
possible bases. After the hybridization of the
sample, the degree of hybridization to the
different sequences can be measured and the exact
sequence of the target gene deduced. SNPs are
thought to be of vital importance in drug
metabolism and toxicology. For example, single
base differences in the regulatory region or
active site of some genes can account for huge
differences in the activity of that gene. Such
SNPs are thought to explain why some people
are able to metabolize certain xenobiotics better
than others. Thus, arrays provide a further
tool for the toxicologist investigating the
nature of susceptible subpopulations and
toxicologic response.
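
The way a sequence is deduced from a chip that carries all four bases at each position can be sketched as follows. This is a simplified illustration under stated assumptions, not the vendor's actual algorithm: each position is represented by four hybridization signals, the strongest signal is taken as the base call, and a call is left ambiguous ("N") when the top signal is not at least `min_ratio` times the runner-up. The function names, the threshold, and the intensity values are all hypothetical.

```python
BASES = "ACGT"


def call_sequence(intensities, min_ratio=1.5):
    """Call one base per position from four probe signals (dict base -> signal)."""
    calls = []
    for signals in intensities:
        ranked = sorted(BASES, key=lambda base: signals[base], reverse=True)
        top, runner_up = ranked[0], ranked[1]
        # Reject calls with no signal or with two nearly equal candidates.
        if signals[top] <= 0 or signals[top] < min_ratio * signals[runner_up]:
            calls.append("N")
        else:
            calls.append(top)
    return "".join(calls)


def find_variants(called, reference):
    """List (position, reference base, observed base) for confident mismatches."""
    return [
        (i, ref, obs)
        for i, (ref, obs) in enumerate(zip(reference, called))
        if obs != "N" and obs != ref
    ]


# Hypothetical signals for a four-base stretch whose reference sequence is ACGT.
reference = "ACGT"
intensities = [
    {"A": 900, "C": 40, "G": 55, "T": 30},
    {"A": 35, "C": 850, "G": 60, "T": 45},
    {"A": 50, "C": 45, "G": 120, "T": 800},   # strongest signal is T, not G
    {"A": 300, "C": 40, "G": 50, "T": 310},   # A and T too close to call
]

called = call_sequence(intensities)
variants = find_variants(called, reference)
print(called, variants)
```

In this toy data the third position is reported as a candidate polymorphism, while the fourth is left uncalled rather than guessed.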

There are still many wrinkles to be ironed 
out before arrays become a standard tool for 
toxicologists. The main issues raised at the 
workshop by those with hands-on experience 
were the following: 

• Expense: the cost of purchasing/contracting 
this technology is still too great for many 
individual laboratories. 




Figure 2. Potential effects of gene knockout within
positively and negatively regulated gene expression
networks. j1 is limiting in wild type for expression of
i1. (A) A simple, two-component, linear regulatory
network operating on gene i1, where j1 is a positive
effector of i1 and j0 is either a positive or negative
effector of j1. This network could be deduced by
examining the consequence of (B) deleting j0 on the
expression of j1 and i1, where the expression of i1
would be decreased or increased depending on
whether j0 was a positive or negative regulator.
These and other connected components of even
greater complexity could be revealed by genome-wide
expression analysis. From Butow.

• Clones: the logistics of identifying, obtaining,
and maintaining a set of nonredundant,
noncontaminated, sequence-verified
species/cell/tissue/field-specific clones.

• Use of inbred strains: where whole-organism
models are being used, the use of inbred
strains is important to reduce the potentially
confusing effects of the individual variation
typically seen in outbred populations.

• Probe: the need for relatively large amounts
of RNA, which limits the type of sample
(e.g., biopsy) that can be used. Also, different
RNA extraction methods can give different
results.

• Specificity: the ability to discriminate accurately
between closely related genes (e.g., the
cytochrome P450 family) and splice variants.

• Quantitation: the quantitation of gene
expression using gene arrays is still open to
debate. One reason for this is the different
incorporation of the labeling dyes. However,
the main difficulty lies in knowing what to
normalize against. One option is to include a
large number of so-called housekeeping genes
in the array. However, the expression of these
genes often changes depending on the tissue
and the toxicant, so it is necessary to characterize
the expression of these genes in the
model system before utilizing them. This is
clearly not a viable option when screening
multiple new compounds. A second option
is to include on the array genes from a nonrelated
species (e.g., a plant gene on an animal
array) and to spike the probe with synthetic
RNA(s) complementary to the gene(s).

• Reproducibility: this is sometimes questionable,
and a figure of approximately two or
three repeats was used as the minimum number
required to confirm initial findings.
Again, however, most people advocated the
use of Northern blots or reverse transcription-PCR
to confirm findings.

• Sensitivity: concerns were voiced about the
number of target molecules that must be present
in a sample for them to be detected on
the array.

• Efficiency: reproducible identification of 1.5-
to 2-fold differences in expression was reported,
although the number of genes that
undergo this level of change and remain
undetected is open to debate. It is important
that this level of detection be ultimately
achieved because it is commonly perceived
that some important transcription factors
and their regulators respond at such low levels.
In most cases, 3- to 5-fold was the minimum
change that most were happy to
accept.

• Bioinformatics: perhaps the greatest concern
was how to interpret the data with
the greatest accuracy and efficiency. The
biggest headache is trying to identify networks
of gene expression that are common to
different treatments or doses. The amount of
data from a single experiment is huge. It may
be that, in the future, several groups individually
equipped with specialized software algorithms
for studying their favorite genes or
gene systems will be able to share the same
hybridized chips. Thus, arrays could usher in
a new perspective on collaboration and the
sharing of data.
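
The quantitation and efficiency points above can be made concrete with a small sketch of the second normalization option: spiking the probe with synthetic RNA complementary to nonrelated-species genes on the array. The sketch assumes equal amounts of spike RNA were added to both samples, so the median treated/control ratio of the spike spots estimates the labeling and scanning bias between channels. The function names, gene identifiers, signal values, and the default 2-fold cutoff are illustrative assumptions, not values reported at the workshop.

```python
from statistics import median


def normalization_factor(treated, control, spike_ids):
    """Median treated/control ratio over spots for the spiked control genes."""
    return median(treated[g] / control[g] for g in spike_ids)


def call_changes(treated, control, spike_ids, min_fold=2.0):
    """Return genes whose normalized ratio changes by at least min_fold."""
    factor = normalization_factor(treated, control, spike_ids)
    calls = {}
    for gene in treated:
        if gene in spike_ids:
            continue  # controls are used for scaling only
        ratio = (treated[gene] / factor) / control[gene]
        if ratio >= min_fold:
            calls[gene] = ("up", ratio)
        elif ratio <= 1.0 / min_fold:
            calls[gene] = ("down", ratio)
    return calls


# Hypothetical background-subtracted signals from one two-colour hybridization.
spike_ids = {"plant_spike_1", "plant_spike_2", "plant_spike_3"}
control = {"plant_spike_1": 1000, "plant_spike_2": 1000, "plant_spike_3": 1000,
           "gene_a": 1000, "gene_b": 1000, "gene_c": 1000}
treated = {"plant_spike_1": 2000, "plant_spike_2": 2000, "plant_spike_3": 2100,
           "gene_a": 8000, "gene_b": 2000, "gene_c": 500}

calls = call_changes(treated, control, spike_ids)
print(calls)
```

Here the treated channel reads about twice as bright for the spikes, so a raw 2-fold ratio for gene_b is treated as no change once the scaling is applied. Lowering `min_fold` toward 1.5 would correspond to the more ambitious detection level discussed under Efficiency.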

EPAMAC 

Perhaps the main reason most scientists are
unable to use array technology is the high cost
involved, whether buying off-the-shelf membranes,
using contract printing services, or
producing chips in-house. In view of this,
researchers at the RTD/NHEERL initiated
the EPAMAC. This consortium brings
together scientists from the EPA and a number
of extramural labs with the aim of developing
microarray capability through the sharing
of resources and data. EPAMAC
researchers are primarily interested in the
developmental and toxicologic changes seen
in testicular and breast tissue, and a portion
of the workshop was set aside for EPAMAC
members to share their ideas on how the
experimental application of microarrays could
facilitate their research. One of the central
areas of interest to EPAMAC members is the
effect of xenobiotics on male fertility and
reproductive health. Of greatest concern is
the effect of exposure during critical periods
of development and germ cell differentiation
(9), and how this may compromise sperm
counts and quality following sexual maturation
(10). As well as spermatogenic tissue,
there is also interest in how residual mRNA
found in mature sperm (11) could be used as
an indicator of previous xenobiotic effects (it
is easier to obtain a semen sample than a
testicular biopsy). Arrays will be used to examine
and compare the effect of exposure to heat
and chemicals in testicular and epididymal
gene expression profiles, with the aim of
establishing relationships/associations
between changes in developmental landmarks
and the effects on sperm count and quality.
Cluster, pattern, and other analysis of such
data should help identify hidden relationships
between genes that may reveal potential
mechanisms of action and uncover roles for
genes with unknown functions.

Summary 

The full impact of DNA arrays may not be
seen for several years, but the interest shown at
this regional workshop indicates the high level
of interest that they foster. Apart from educating
and advertising the various technologies in
this field, this workshop brought together a
number of researchers from the Research
Triangle Park area who are already using DNA
arrays. The interest in sharing ideas and experiences
led to the initiation of a Triangle array
user's group.

Array technology is still in its infancy. This
means that the hardware is still improving and
there is no current consensus for standard
procedures, quantitation, and interpretation.
Consistency in spotting and scanning arrays is
not yet optimized, and this is one of the most
critical requirements of any experiment. In
addition, one of the dark regions of array
technology, strife in the courts over who owns
what portions of it, has further muddled the
future and is a potential barrier toward the
development of consensus procedures.

Perhaps the greatest hurdle for the application
of arrays is the actual interpretation of
data. No specialists in bioinformatics attended
the workshop, largely because they are rare and
because as yet no one seems clear on the best
method of approaching data analysis and
interpretation. Cross-referencing results from
multiple experiments (time, dose, repeats, different
animals, different species) to identify commonly
expressed genes is a great challenge. In most
cases, we are still a long way from understanding
how the expression of gene X is related to
the expression of gene Y and ordering gene
expression to delineate causal relationships.

To the ordinary scientist in the typical
laboratory, however, the most immediate problem
is a lack of affordable instrumentation.
One can purchase premade membranes at
relatively affordable prices. Although these
may be useful in identifying individual genes
to pursue in more detail using other methods,
the numbers that would be required for even a
small routine toxicology experiment prohibit
this as a truly viable approach. For the
toxicologist, there is a need to carry out multiple
experiments: dose responses, time curves,
multiple animals, and repeats. Glass-based
DNA arrays are most attractive in this context
because they can be prepared in large batches
from the same DNA source and accommodate
control and treated samples on the same
chip. Another problem with current off-the-shelf
arrays is that they often do not contain
one or more of the particular genes a group is
interested in. One alternative is to obtain
and/or produce a set of custom clones and
have contract printing of membranes or slides
carried out by a company such as Genomic
Solutions, Inc. (Ann Arbor, MI). This approach
is less expensive than laying out capital for
one's own entire system, although at some
point it might make economic sense to print
one's own arrays.

Finally, DNA arrays are currently a team
effort. They are a technology that uses a wide
range of skills including engineering, statistics,
molecular biology, chemistry, and bioinformatics.
Because most individuals are skilled in
only one or perhaps two of these areas, it
appears that success with arrays may be best
expected by teams of collaborators consisting
of individuals having each of these skills.

Those considering array applications may
be amused or goaded on by the following
quote from Fortune magazine (12):

Microprocessors have reshaped our economy,
spawned vast fortunes and changed the way we live.
Gene chips could be even bigger.

Although this comment may have been
designed to excite the imagination rather than
accurately reflect the truth, it is fair to say that
the age of functional genomics is upon us.
DNA arrays look set to be an important tool in
this new age of biotechnology and will likely
contribute answers to some of toxicology's
most fundamental questions.
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1. An important feature of the work of many molecular biologists is identifying which 
genes are switched on and off in a cell under different environmental conditions or 
subsequent to xenobiotic challenge. Such information has many uses, including the 
deciphering of molecular pathways and facilitating the development of new experimental 
and diagnostic procedures. However, the student of gene hunting should be forgiven for
perhaps becoming confused by the mountain of information available as there appear to be
almost as many methods of discovering differentially expressed genes as there are research
groups using the technique.

2. The aim of this review was to clarify the main methods of differential gene expression
analysis and the mechanistic principles underlying them. Also included is a discussion on
some of the practical aspects of using this technique. Emphasis is placed on the so-called
'open' systems, which require no prior knowledge of the genes contained within the study
model. Whilst these will eventually be replaced by 'closed' systems in the study of human,
mouse and other commonly studied laboratory animals, they will remain a powerful tool for
those examining less fashionable models.

3. The use of suppression-PCR subtractive hybridization is exemplified in the
identification of up- and down-regulated genes in rat liver following exposure to
phenobarbital, a well-known inducer of the drug metabolizing enzymes.

4. Differential gene display provides a coherent platform for building libraries and
microchip arrays of 'gene fingerprints' characteristic of known enzyme inducers and
xenobiotic toxicants, which may be interrogated subsequently for the identification and
characterization of xenobiotics of unknown biological properties.

Introduction

It is now apparent that the development of almost all cancers and many non-neoplastic
diseases is accompanied by altered gene expression in the affected cells
compared to their normal state (Hunter 1991, Wynford-Thomas 1991, Vogelstein
and Kinzler 1993, Semenza 1994, Cassidy 1995, Kleinjan and Van Heyningen 1998).
Such changes also occur in response to external stimuli such as pathogenic
micro-organisms (Rohn et al. 1996, Singh et al. 1997, Griffin and Krishna 1998, Lunney
1998) and xenobiotics (Sewall et al. 1995, Dogra et al. 1998, Ramana and Kohli
1998), as well as during the development of undifferentiated cells (Hecht 1998,
Rudin and Thompson 1998, Schneider-Maunoury et al. 1998). The potential
medical and therapeutic benefits of understanding the molecular changes which
occur in any given cell in progressing from the normal to the 'altered' state are
enormous. Such profiling essentially provides a 'fingerprint' of each step of a
cell's development or response and should help in the elucidation of specific and
sensitive biomarkers representing, for example, different types of cancer or previous
exposure to certain classes of chemicals that are enzyme inducers.

In drug metabolism, many of the xenobiotic-metabolizing enzymes (including
the well-characterized isoforms of cytochrome P450) are inducible by drugs and
chemicals in man (Pelkonen et al. 1998), predominantly involving transcriptional
activation of not only the cognate cytochrome P450 genes, but additional cellular
proteins which may be crucial to the phenomenon of induction. Accordingly, the
development of methodology to identify and assess the full complement of genes
that are either up- or down-regulated by inducers is crucial in the development of
knowledge to understand the precise molecular mechanisms of enzyme induction
and how this relates to drug action. Similarly, in the field of chemical-induced
toxicity, it is now becoming increasingly obvious that most adverse reactions to
drugs and chemicals are the result of multiple gene regulation, some of which are
causal and some of which are casually-related to the toxicological phenomenon per
se. This observation has led to an upsurge in interest in gene-profiling technologies
which differentiate between the control and toxin-treated gene pools in target tissues
and are, therefore, of value in rationalizing the molecular mechanisms of xenobiotic-induced
toxicity. Knowledge of toxin-dependent gene regulation in target tissues is
not solely an academic pursuit as much interest has been generated in the
pharmaceutical industry to harness this technology in the early identification of toxic
drug candidates, thereby shortening the developmental process and contributing
substantially to the safety assessment of new drugs. For example, if the gene profile
in response to say a testicular toxin that has been well-characterized in vivo could be
determined in the testis, then this profile would be representative of all new drug
candidates which act via this specific molecular mechanism of toxicity, thereby
providing a useful and coherent approach to the early detection of such toxicants.
Whereas it would be informative to know the identity and functionality of all genes
up/down regulated by such toxicants, this would appear a longer term goal, as the
majority of human genes have not yet been sequenced, far less their functionality
determined. However, the current use of gene profiling yields a pattern of gene
changes for a xenobiotic of unknown toxicity which may be matched to that of well-characterized
toxins, thus alerting the toxicologist to possible in vivo similarities
between the unknown and the standard, thereby providing a platform for more
extensive toxicological examination. Such approaches are beginning to gain
momentum, in that several biotechnology companies are commercially producing
'gene chips' or 'gene arrays' that may be interrogated for toxicity assessment of
xenobiotics. These chips consist of hundreds/thousands of genes, some of which are
degenerate in the sense that not all of the genes are mechanistically-related to any
one toxicological phenomenon. Whereas these chips are useful in broad-spectrum
screening, they are maturing at a substantial rate, in that gene arrays are now
becoming more specific, e.g. chips for the identification of changes in growth factor
families that contribute to the aetiology and development of chemically-induced
neoplasias.

Although documenting and explaining these genetic changes presents a
formidable obstacle to understanding the different mechanisms of development and
disease progression, the technology is now available to begin attempting this difficult
challenge. Indeed, several 'differential expression analysis' methods have been
developed which facilitate the identification of gene products that demonstrate
altered expression in cells of one population compared to another. These methods
have been used to identify differential gene expression in many situations, including
invading pathogenic microbes (Zhao et al. 1998), in cells responding to extracellular
and intracellular microbial invasion (Duguid and Dinauer 1990, Ragno et al. 1997,
Maldarelli et al. 1998), in chemically treated cells (Syed et al. 1997, Rockett et al.
1999), neoplastic cells (Liang et al. 1992, Chang and Terzaghi-Howe 1998),
activated cells (Gurskaya et al. 1996, Wan et al. 1996), differentiated cells (Hara et
al. 1991, Guimaraes et al. 1995a, b), and different cell types (Davis et al. 1984,
Hedrick et al. 1984, Zhu et al. 1998). Although differential expression analysis
technologies are applicable to a broad range of models, perhaps their most important
advantage is that, in most cases, absolutely no prior knowledge of the specific genes
which are up- or down-regulated is required.

The field of differential expression analysis is a large and complex one, with 
many techniques available to the potential user. These can be categorized into 
several methodological approaches, including: 

(1) Differential screening, 

(2) Subtractive hybridization (SH) (includes methods such as chemical cross- 
linking subtraction— CCLS, suppression-PCR subtractive hybridization— 
SSH, and representational difference analysis — RDA), 

(3) Differential display (DD), 

(4) Restriction endonuclease facilitated analysis (including serial analysis of gene 
expression— SAGE — and gene expression fingerprinting — GEF), 

(5) Gene expression arrays, and 

(6) Expressed sequence tag (EST) analysis. 

The above approaches have been used successfully to isolate differentially 
expressed genes in different model systems. However, each method has its own 
subtle (and sometimes not so subtle) characteristics which incur various advantages 
and disadvantages. Accordingly, it is the purpose of this review to clarify the 
mechanistic principles underlying the main differential expression methods and to 
highlight some of the broader considerations and implications of this very powerful 
and increasingly popular technique. Specifically, we will concentrate on the so-called
'open' systems, namely those which do not require any knowledge of gene
sequences and, therefore, are useful for isolating unknown genes. Two 'closed'
systems (those utilising previously identified gene sequences), EST analysis and the
use of DNA arrays, will also be considered briefly for completeness. Whilst
emphasis will often be placed on suppression PCR subtractive hybridization (SSH,
the approach employed in this laboratory), it is the aim of the authors to highlight,
wherever possible, those areas of common interest to those who use, or intend to use,
differential gene expression analysis.



Differential cDNA library screening (DS)

Despite the development of multiple technological advances which have recently 
brought the field of gene expression profiling to the forefront of molecular analysis, 
recognition of the importance of differential gene expression and characterization of 
differentially expressed genes has existed for many years. One of the original 
approaches used to identify such genes was described 20 years ago by St John and 
Davis (1979). These authors developed a method, termed 'differential plaque filter
hybridization', which was used to isolate galactose-inducible DNA sequences from
yeast. The theory is simple: a genomic DNA library is prepared from normal,
unstimulated cells of the test organism/tissue and multiple filter replicas are 
prepared. These replica blots are probed with radioactively (or otherwise) labelled 
complex cDNA probes prepared from the control and test cell mRNA populations. 
Those mRNAs which are differentially expressed in the treated cell population will 
show a positive signal only on the filter probed with cDNA from the treated cells. 
Furthermore, labelled cDNA from different test conditions can be used to probe 
multiple blots, thereby enabling the identification of mRNAs which are only up- 
regulated under certain conditions. For example, St John and Davis (1979) screened 
replica filters with acetate-, glucose- and galactose-derived probes in order to obtain 
genes induced specifically by galactose metabolism. Although groundbreaking in its
time, this method is now considered insensitive and time-consuming, as up to 2
months are required to complete the identification of genes which are differentially 
expressed in the test population. In addition, there is no convenient way to check 
that the procedure has worked until the whole process has been completed. 
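
The logic of comparing replica filters can be summarised in a few lines. The sketch below is an illustration only and assumes that each filter has already been reduced to a numeric signal per plaque and that a single threshold separates positive from negative spots; the plaque identifiers, signal values, and the function name `induced_only_on` are hypothetical.

```python
def induced_only_on(target, filters, threshold=1.0):
    """Return plaques positive on the target filter and negative on all others."""
    others = [name for name in filters if name != target]
    hits = []
    for plaque, signal in filters[target].items():
        if signal < threshold:
            continue  # no signal with the probe of interest
        if all(filters[o].get(plaque, 0.0) < threshold for o in others):
            hits.append(plaque)
    return sorted(hits)


# Hypothetical signals for four plaques on replica filters, each probed with
# cDNA from cells grown on a different carbon source.
filters = {
    "acetate":   {"p1": 5.0, "p2": 0.2, "p3": 0.1, "p4": 0.3},
    "glucose":   {"p1": 4.5, "p2": 0.1, "p3": 0.2, "p4": 3.0},
    "galactose": {"p1": 5.2, "p2": 6.0, "p3": 0.1, "p4": 2.8},
}

galactose_specific = induced_only_on("galactose", filters)
print(galactose_specific)
```

Only the plaque that lights up with the galactose-derived probe alone is retained; plaques positive on every filter, or shared with the glucose filter, are discarded as not specific to galactose.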

Subtractive Hybridization (SH) 

The developing concept of differential gene expression and the success of early 
approaches such as that described by St John and Davis (1979) soon gave rise to a 
search for more convenient methods of analysis. One of the first to be developed was 
SH, numerous variations of which have since been reported (see below). In general, 
this approach involves hybridization of mRNA/cDNA from one population (tester) 
to excess mRNA/cDNA from another (driver), followed by separation of the 
unhybridized tester fraction (differentially expressed) from the hybridized common 
sequences. This step has been achieved physically, chemically and through the use 
of selective polymerase chain reaction (PCR) techniques. 
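
The principle of subtraction can be pictured as removing from the tester population every sequence that finds a partner in an excess of driver. The following sketch is a deliberately idealised model, assuming that hybridization goes to completion and that each driver copy removes exactly one matching tester copy; the sequence names, copy numbers, and the `driver_excess` parameter are hypothetical.

```python
from collections import Counter


def subtract(tester, driver, driver_excess=10):
    """Return tester copies left unhybridized after mixing with excess driver."""
    remaining = Counter()
    for sequence, copies in tester.items():
        # Driver is added in excess, so scale its copy number accordingly.
        available = driver.get(sequence, 0) * driver_excess
        left = copies - available
        if left > 0:
            remaining[sequence] = left
    return remaining


# Hypothetical transcript copy numbers in treated (tester) and control (driver) cells.
tester = Counter({"actin": 100, "albumin": 200, "induced_gene": 50})
driver = Counter({"actin": 100, "albumin": 180, "induced_gene": 1})

enriched = subtract(tester, driver)
print(dict(enriched))
```

Common sequences are removed completely, whereas the induced sequence survives because even a 10-fold excess of driver cannot pair with all of its tester copies. Real hybridizations are kinetic and incomplete, which is one reason the practical methods described below differ in how the unhybridized fraction is recovered.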

Physical separation 

Original subtractive hybridization technology involved the physical separation
of hybridized common species from unique single stranded species. Several methods
of achieving this have been described, including hydroxyapatite chromatography
(Sargent and Dawid 1983), avidin-biotin technology (Duguid and Dinauer 1990)
and oligodT-latex separation (Hara et al. 1991). In the first approach, common
mRNA species are removed by cDNA (from test cells)-mRNA (from control cells)
subtractive hybridization followed by hydroxyapatite chromatography, as
hydroxyapatite specifically adsorbs the cDNA-mRNA hybrids. The unabsorbed cDNA is
then used either for the construction of a cDNA library of differentially expressed
genes (Sargent and Dawid 1983, Schneider et al. 1988) or directly as a probe to
screen a preselected library (Zimmerman et al. 1980, Davis et al. 1984, Hedrick et al.
1984). A schematic diagram of the procedure is shown in figure 1.

Less rigorous physical separation procedures coupled with sensitivity enhancing 
PCR steps were later developed as a means to overcome some of the problems 
encountered with the hydroxyapatite procedure. For example, Duguid and Dinauer
(1990) described a method of subtraction utilizing biotin-affinity systems as a means
to remove hybridized common sequences. In this process, both the control and
tester mRNA populations are first converted to cDNA and an adaptor ('oligovector',

Figure 1. The hydroxyapatite method of subtractive hybridization. cDNA derived from the
treated/altered (tester) population is mixed with a large excess of mRNA from the control (driver)
population. Following hybridization, mRNA-cDNA hybrids are removed by hydroxyapatite
chromatography. The only cDNAs which remain are those which are differentially expressed in
the treated/altered population. In order to facilitate the recovery of full length clones, small cDNA
fragments are removed by exclusion chromatography. The remaining cDNAs are then cloned into
a vector for sequencing, or labelled and used directly to probe a library, as described by Sargent
and Dawid (1983).

containing a restriction site) ligated to both sides. Both populations are then
amplified by PCR, but the driver cDNA population is subsequently digested with
the restriction endonuclease that cuts within the adaptor. This serves to cleave the
oligovector and reduce the amplification potential of the control population. The digested
control population is then biotinylated and an excess mixed with tester cDNA.
Following denaturation and hybridization, the mix is applied to a biocytin column
(streptavidin may also be used) to remove the control population, including
heteroduplexes formed by annealing of common sequences from the tester
population. The procedure is repeated several times following the addition of fresh 






[Figure 2 schematic. Steps shown: control (driver) mRNA is annealed to polydT-latex beads,
followed by cDNA synthesis; test (tester) mRNA is mixed and annealed; the beads are centrifuged,
the supernatant is collected and stored, polyA is dissociated and the supernatant reapplied;
tester-specific mRNA is retrieved after 4 rounds of hybridization; cDNA synthesis; adaptors are
ligated and the product inserted into a vector; inserts are sequenced and/or used in other
downstream applications.]
Figure 2. The use of oligodT-latex to perform subtractive hybridization. mRNA extracted from the
control (driver) population is converted to anchored cDNA using polydT oligonucleotides
attached to latex beads. mRNA from the treated/altered (tester) population is repeatedly
hybridized against an excess of the anchored driver cDNA. The final population of mRNA is
tester specific and can be converted into cDNA for cloning and other downstream applications, as
described by Hara et al. (1991).

control cDNA. In order to further enrich those species differentially expressed in
the tester cDNA, the subtracted tester population is amplified by PCR following 
every second subtraction cycle. After six cycles of subtraction (three reamplification 
steps) the reaction mix is ligated into a vector for further analysis. 

In a slightly different approach, Hara et al. (1991) utilized a method whereby
oligo(dT30) primers attached to a latex substrate are used to first capture mRNA
extracted from the control population. Following 1st strand cDNA synthesis, the
RNA strand of the heteroduplexes is removed by heat denaturation and centrifugation
(the cDNA-oligotex-dT30 forms a pellet and the supernatant is removed).
A quantity of tester mRNA is then repeatedly hybridized to the immobilized control
(driver) cDNA (which is present in 20-fold excess). After several rounds of
hybridization the only mRNA molecules left in the tester mRNA population are
those which are not found in the driver cDNA-oligotex-dT30 population. These
tester-specific mRNA species are then converted to cDNA and, following the 
addition of adaptor sequences, amplified by PCR. The PCR products are then 
ligated into a vector for further analysis using restriction sites incorporated into the 
PCR primers. A schematic illustration of this subtraction process is shown in figure 
2. 

However, all these methods utilising physical separation have been described as 
inefficient due to the requirement for large starting amounts of mRNA, significant 
loss of material during the separation process and a need for several rounds of 
hybridization. Hence, new methods of differential expression analysis have recently 
been designed to eliminate these problems.

Chemical Cross-Linking Subtraction (CCLS)

In this technique, originally described by Hampson et al. (1992), driver mRNA
is mixed with tester cDNA (1st strand only) in a ratio of > 20:1. The common
sequences form cDNA:mRNA hybrids, leaving the tester specific species as single
stranded cDNA. Rather than being physically separated, these hybrids are inactivated
chemically using 2,5-diaziridinyl-1,4-benzoquinone (DZQ). Labelled probes are
then synthesized from the remaining single stranded cDNA species (unreacted
mRNA species remaining from the driver are not converted into probe material due
to the specificity of the Sequenase T7 DNA polymerase used to make the probe) and used
to screen a cDNA library made from the tester cell population. A schematic diagram
of the system is shown in figure 3.

It has been shown that the differentially expressed sequences can be enriched at
least 300-fold with one round of subtraction (Hampson et al. 1992), and that the
technique should allow isolation of cDNAs derived from transcripts that are present
at less than 50 copies per cell. This equates to genes at the low end of intermediate
abundance (see table 1). The main advantages of the CCLS approach are that it is
rapid, technically simple and also produces fewer false positives than other
differential expression analysis methods. However, like the physical separation
protocols, a major drawback with CCLS is the large amount of starting material
required (at least 10 µg RNA). Consequently, the technique has recently been
refined so that a renewable source of RNA can be generated. The degenerate random
oligonucleotide primed (DROP) adaptation (Hampson et al. 1996, Hampson and
Hampson 1997) uses random hexanucleotide sequences to prime solid phase-synthesized
cDNA. Since each primer includes a T7 polymerase promoter sequence

Figure 3. Chemical cross-linking subtraction. Excess driver mRNA is mixed with 1st strand tester
cDNA. The common sequences form mRNA:cDNA hybrids which are cross linked with
2,5-diaziridinyl-1,4-benzoquinone (DZQ) and the remaining cDNA sequences are differentially
expressed in the tester population. Probes are made from these sequences using Sequenase 2.0
DNA polymerase, which lacks reverse transcriptase activity, and, therefore, does not react with the
remaining mRNA molecules from the driver. The labelled probes are then used to screen a cDNA
library for clones of differentially expressed sequences. Adapted from Walter et al. (1996), with
permission.



Table 1. The abundance of mRNA species and classes in a typical mammalian cell.

              Copies of each   No. of mRNA species   Mean % of each      Mean mass (ng) of each
mRNA class    species/cell     in class              species in class    species/µg total RNA
Abundant      12000            4                     3.3                 1.65
Intermediate  300              500                   0.08                0.04
Rare          15               11000                 0.004               0.002

Modified from Bertioli et al. (1995).

at the 5' end, the final pool of random cDNA-fragments is a PCR-renewable cDNA
population which is representative of the expressed gene pool and can be used to 
synthesize sense RNA for use as driver material. Furthermore, if the final pool of 
random cDNA fragments is reamplified using biotinylated T7 primer and random 
hexamer, the product can be captured with streptavidin beads and the antisense 
strand eluted for use as tester. Since both target and driver can be generated from 
the same DROP product, subtraction can be performed in both directions (i.e. for 
up- and down-regulated species) between two different DROP products. 
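
The abundance classes referred to above (table 1) can be cross-checked with a short calculation. The sketch below recomputes the percentage and mass columns from the copy numbers and species counts given in the table. The figure of 50 ng of mRNA per µg of total RNA is an assumption made for this example; it is simply the ratio implied by the last two columns of the table, and the variable and function names are illustrative.

```python
# Values from table 1: class -> (copies of each species per cell, number of species)
classes = {
    "abundant": (12000, 4),
    "intermediate": (300, 500),
    "rare": (15, 11000),
}

# Assumption for this sketch: mRNA makes up 5 % of total RNA (50 ng per ug),
# the ratio implied by the last two columns of table 1.
MRNA_NG_PER_UG_TOTAL_RNA = 50.0

# Total number of mRNA molecules per cell, summed over all classes.
total_copies = sum(copies * n_species for copies, n_species in classes.values())


def per_species(copies):
    """Return (% of all mRNA, ng per ug total RNA) for a single species."""
    percent = 100.0 * copies / total_copies
    mass_ng = percent / 100.0 * MRNA_NG_PER_UG_TOTAL_RNA
    return percent, mass_ng


for name, (copies, _) in classes.items():
    percent, mass_ng = per_species(copies)
    print(f"{name:12s} {percent:.4f} % of mRNA, {mass_ng:.4f} ng/ug total RNA")
```

On these figures a transcript present at fewer than 50 copies per cell lies between the rare and intermediate classes, which is consistent with the statement that CCLS reaches the low end of intermediate abundance.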

Representational Difference Analysis (RDA)

RDA of cDNA (Hubank and Schatz 1994) is an extension of the technique 
originally applied to genomic DNA as a means of identifying differences between 
two complex genomes (Lisitsyn et al. 1993). It is a process of subtraction and 
amplification involving subtractive hybridization of the tester in the presence of 
excess driver. Sequences in the tester that have homologues in the driver are 
rendered unamplifiable, whereas those genes expressed only in the tester retain the 
ability to be amplified by PCR. The procedure is shown schematically in figure 4.

In essence, the driver and tester mRNA populations are first converted to cDNA
and amplified by PCR following the ligation of an adaptor. The adaptors are then
removed from both populations and a new (different) adaptor ligated to the
amplified tester population only. Driver and tester populations are next melted and
hybridized together in a ratio of 100:1. Following hybridization, only tester : tester
homohybrids have 5' adaptors at each end of the DNA duplex and can, thus, be filled
in at both 3' ends. Hence, only these molecules are amplified exponentially during
the subsequent PCR step. Although tester : driver heterohybrids are present, they
only amplify in a linear fashion, since the strand derived from the driver has no
adaptor to which the primer can bind. Driver : driver homohybrids have no
adaptors and, therefore, are not amplified. Single stranded molecules are digested
with mung bean nuclease before a further PCR-enrichment of the tester : tester
homohybrids. The adaptors on the amplified tester population are then replaced and
the whole process repeated a further two or three times using an increasing excess of
driver (Hubank and Schatz used a tester : driver ratio of 1:400, 1:80000 and
1:800000 for the second, third and fourth hybridizations, respectively). Different
adaptors are ligated to the tester between successive rounds of hybridization and
amplification to prevent the accumulation of PCR products that might interfere with
subsequent amplifications. The final display is a series of differentially expressed
gene products easily observable on an ethidium bromide gel.
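
The difference between exponential and linear amplification is what gives RDA its selectivity, and it can be made concrete with a small calculation. The sketch below is an idealized model that assumes perfect efficiency in every PCR cycle; the function name and the choice of 20 cycles are illustrative and are not taken from the published protocol.

```python
def copies_after_pcr(cycles, adaptor_strands):
    """Copies per starting duplex after a given number of PCR cycles.

    adaptor_strands is the number of strands in the duplex that carry the
    adaptor: 2 for tester:tester, 1 for tester:driver, 0 for driver:driver.
    Perfect efficiency is assumed in every cycle.
    """
    if adaptor_strands == 2:
        return 2 ** cycles      # both strands are copied: exponential growth
    if adaptor_strands == 1:
        return 1 + cycles       # one new strand per cycle: linear growth
    return 1                    # no primer binding site: no amplification


CYCLES = 20  # illustrative value
for label, strands in [("tester:tester", 2), ("tester:driver", 1), ("driver:driver", 0)]:
    print(label, copies_after_pcr(CYCLES, strands))
```

Even with a large initial excess of driver, the exponentially amplified tester : tester hybrids therefore come to dominate the product after a modest number of cycles.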

The main advantages of RDA are that it offers a reproducible and sensitive
approach to the analysis of differentially expressed genes. Hubank and Schatz (1994)
reported that they were able to isolate genes that were differentially expressed in
substantially less than 1 % of the cells from which the tester is derived. Perhaps the
main drawback is that multiple rounds of ligation, hybridization, amplification and
digestion are required. The procedure is, therefore, lengthier than many other
differential display approaches and provides more opportunity for operator-induced
error to occur. Although the generation of false positives has been noted, this has
been solved to some degree by O'Neill and Sinclair (1997) through the use of HPLC-purified
adaptors. These are free of the truncated adaptors which appear to be a
major source of the false positive bands. A very similar technique to RDA, termed 
linker capture subtraction (LCS), was described by Yang and Sytowski (1996).

Figure 4. The representational difference analysis (RDA) technique. Driver and tester cDNA are
digested with a 4-cutter restriction enzyme such as DpnII. The 1st set of 12/24 adaptor strands
(oligonucleotides) are ligated to each other and the digested cDNA products. The 12mer is
subsequently melted away and the 3' ends filled in using Taq DNA polymerase. Each cDNA
population is then amplified using PCR, following which the 1st set of adaptors is removed with
DpnII. A second set of 12/24 adaptor strands is then added to the amplified tester cDNA
population, after which the tester is hybridized against a large excess of driver. The 12mer
adaptors are melted and the 3' ends filled in as before. PCR is carried out with primers identical
to the new 24mer adaptor. Thus, the only hybridization products which are exponentially
amplified are those which are tester : tester combinations. Following PCR, ssDNA products are
removed with mung bean nuclease, leaving the 'first difference product'. This is digested and a
third set of 12/24 adaptors added before repeating the subtraction process from the hybridization
stage. The process is repeated to the 3rd or 4th difference product, as described by Lisitsyn et al.
(1993) and Hubank and Schatz (1994).

Suppression PCR Subtractive Hybridization (SSH)

The most recent adaptation of the SH approach to differential expression 
analysis was first described by Diatchenko et al. (1996) and Gurskaya et al. (1996). 
They reported that a 1000-5000 fold enrichment of rare cDNAs (equivalent to
isolating mRNAs present at only a few copies per cell) can be obtained without the
need for multiple hybridizations/subtractions. Instead of physical or chemical 
removal of the common sequences, a PCR-based suppression system is used (see 
figure 5). 

In SSH, excess driver cDNA is added to two portions of the tester cDNA which
have been ligated with different adaptors. A first round of hybridization serves to
enrich differentially expressed genes and equalize rare and abundant messages.
Equalization occurs since reannealing is more rapid for abundant molecules than for
rarer molecules due to the second order kinetics of hybridization (James and Higgins
1985). The two primary hybridization mixes are then mixed together in the presence
of excess driver and allowed to hybridize further. This step permits the annealing of
single stranded complementary sequences which did not hybridize in the primary
hybridization, and in doing so generates templates for PCR amplification. Although
there are several possible combinations of the single stranded molecules present in
the secondary hybridization mix, only one particular combination (differentially
expressed in the tester cDNA and composed of complementary strands having different
adaptors) can amplify exponentially.
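
The equalization effect follows directly from second order kinetics and can be illustrated numerically. The sketch below uses the standard second order rate law, in which the concentration remaining single stranded after time t is C0 / (1 + k C0 t). It models self-reannealing of the tester only and ignores the driver; the rate constant, time and starting concentrations are arbitrary illustrative values, not measured ones.

```python
def single_stranded(c0, k, t):
    """Concentration still single stranded after time t.

    Second order reannealing: dC/dt = -k * C**2, so C(t) = c0 / (1 + k*c0*t).
    """
    return c0 / (1.0 + k * c0 * t)


K = 1.0  # arbitrary rate constant
T = 1.0  # arbitrary hybridization time

# Hypothetical starting concentrations differing 1000-fold (arbitrary units).
abundant_start, rare_start = 1000.0, 1.0
abundant_left = single_stranded(abundant_start, K, T)
rare_left = single_stranded(rare_start, K, T)

ratio_before = abundant_start / rare_start
ratio_after = abundant_left / rare_left
print(f"abundant:rare before hybridization = {ratio_before:.0f}:1")
print(f"abundant:rare after hybridization  = {ratio_after:.1f}:1")
```

Because the abundant species reanneals far faster, the single stranded fractions of the two species converge, which is the equalization described above. As noted in the following paragraph, equalization is less complete in practice than this idealized model suggests.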

Having obtained the final differential display, two options are available if cloning 
of cDNAs is desired. One is to transform the whole of the final PCR reaction into 
competent cells. Transformed colonies can then be isolated and their inserts 
characterized by sequencing, restriction analysis or PCR. Alternatively, the final 
PCR products can be resolved on a gel and the individual bands excised, reamplified 
and cloned. The first approach is technically simpler and less time consuming. 
However, ligation/transformation reactions are known to be biased towards the
cloning of smaller molecules, and so the final population of clones will probably not
contain a representative selection of the larger products. In addition, although
equalization theoretically occurs, observations in this laboratory suggest that this is
by no means perfectly accomplished. Consequently, some gene species are present
in a higher number than others and this will be represented in the final population
of clones. Thus, in order to obtain a substantial proportion of those gene species that
actually demonstrate differential expression in the tester population, the number of
clones that will have to be screened after this step may be substantial. The second
approach is initially more time consuming and technically demanding. However, it
would appear to offer better prospects for cloning larger and low abundance gel
products. In addition, one can incorporate a screening step that distinguishes
products of different sequence but of the same size (HA-staining, see
later). In this way, a good idea of the final number of clones to be isolated and
identified can be achieved.

An alternative (or even complementary) approach is to use the final differential
display reaction to screen a cDNA library to isolate full length clones for further
characterization, or a DNA array (see later) to quickly identify known genes. SSH
has been used in this laboratory to begin characterization of the short-term gene
expression profiles of enzyme inducers such as phenobarbital (Rockett et al. 1997)
and Wy-14,643 (Rockett et al. unpublished observations). The isolation of
differentially expressed genes in this manner enables the construction of a fingerprint

Figure 5. PCR-select cDNA subtraction. In the primary hybridization, an excess of driver cDNA is
added to each tester cDNA population. The samples are heat denatured and allowed to hybridize
for between 3 and 8 h. This serves two purposes: (1) to equalize rare and abundant molecules; and
(2) to enrich for differentially expressed sequences; cDNAs that are not differentially expressed
form type c molecules with the driver. In the secondary hybridization, the two primary
hybridizations are mixed together without denaturing. Fresh denatured driver can also be added
at this point to allow further enrichment of differentially expressed sequences. Type e molecules
are formed in this secondary hybridization which are subsequently amplified using two rounds of
PCR. The final products can be visualized on an agarose gel, labelled directly or cloned into a
vector for downstream manipulation. As described by Diatchenko et al. (1996) and Gurskaya
et al. (1996), with permission.

Figure 6. Flow diagram showing method used in this laboratory to isolate and identify clones of genes
which are differentially expressed in rat liver following short term exposure to the enzyme
inducers, phenobarbital and Wy-14,643.

of expressed genes which are unique to each compound and time/dose point. Such 
information could be useful in short-term characterization of the toxic potential of 
new compounds by comparing the gene-expression profiles they elicit with those 
produced by known inducers. Figure 6 shows a flow diagram of the method used to 
isolate, verify and clone differentially expressed genes, and figure 7 shows expression 
profiles obtained from a typical SSH experiment. Subsequent sub-cloning of the 
individual bands, sequencing and gene data base interrogation reveals many genes 
which are either up- or down-regulated by phenobarbital in the rat (tables 2 and 3). 

One of the advantages in using the SSH approach is that no prior knowledge is 
required of which specific genes are up/down-regulated subsequent to xenobiotic

Figure 7. SSH display patterns obtained from rat liver following 3-day treatment with Wy-14,643 or
phenobarbital. mRNA extracted from control and treated livers was used to generate the
differential displays using the PCR-Select cDNA subtraction kit (Clontech). Lanes: 1, 1 kb
ladder; 2, genes upregulated following Wy-14,643 treatment; 3, genes downregulated following
Wy-14,643 treatment; 4, genes upregulated following phenobarbital treatment; 5, genes
downregulated following phenobarbital treatment; 6, 1 kb ladder. Reproduced from Rockett et
al. (1997), with permission.

exposure, and an almost complete complement of genes is obtained. For example,
the peroxisome proliferator and non-genotoxic hepatocarcinogen Wy-14,643 up-regulates
at least 28 genes and down-regulates at least 15 in the rat (a sensitive
species) and produces 48 up- and 37 down-regulated genes in the guinea pig, a
resistant species (Rockett, Swales, Esda and Gibson, unpublished observations).
One of these genes, CD81, was up-regulated in the rat and down-regulated in the
guinea pig following Wy-14,643 treatment. CD81 (alternatively named TAPA-1) is
a widely expressed cell surface protein which is involved in a large number of cellular
processes including adhesion, activation, proliferation and differentiation (Levy et
al. 1998). Since all of these functions are altered to some extent in the phenomena
of hepatomegaly and non-genotoxic hepatocarcinogenesis, it is intriguing, and
probably mechanistically relevant, that CD81 expression is differentially regulated
in a resistant and a susceptible species. However, the down-side of this approach is
that, although the majority of genes can be sequenced and matched to database sequences,
the latter are predominantly expressed sequence tags or genes of completely
unknown function, thus partially obscuring a realistic overall assessment of the
critical genes of genuine biological interest. Notwithstanding the lack of complete
functional identification of altered gene expression, such gene profiling studies
essentially provide a 'molecular fingerprint' in response to xenobiotic challenge,
thereby serving as a mechanistically relevant platform for further detailed
investigations.

Differential Display (DD)

Originally described as 'RNA fingerprinting by arbitrarily primed PCR' (Liang
and Pardee 1992) this method is now more commonly referred to as 'differential

Table 2. Genes up-regulated in rat liver following 3-day exposure to phenobarbital.

Band number                          Highest sequence
(approximate size in bp)             similarity         FASTA-EMBL gene identification
5 (1300)                             93.5%              CYP2B1
7 (1000)                             95.1%              Preproalbumin
                                                        Serum albumin mRNA
                                     98.3%              NCI-CGAP-Pr1 H. sapiens (EST)
10 (850)                             95.7%              CYP2B1
11 (800)                  Clone 1    94.9%              CYP2B1
                          Clone 2    75.3%              CYP2B2
12 (750)                             93.8%              TRPM-2 mRNA
                                                        Sulfated glycoprotein
15 (600)                             92.9%              Preproalbumin
                                                        Serum albumin mRNA
16 (55)                   Clone 1    95.2%              CYP2B1
                          Clone 2    93.6%              Haptoglobulin mRNA partial alpha
21 (350)                             99.3%              18S, 5.8S & 28S rRNA

Bands 1-4, 6, 9, 13, 14, and 17-20 were shown to be false positives by dot blot analysis and, therefore,
were not sequenced. Derived from Rockett et al. (1997). It should be noted that the above genes do not
represent the complete spectrum of genes which are up-regulated in rat liver by phenobarbital, but
simply represent the genes sequenced and identified to date.


Table 3. Genes down-regulated in rat liver following 3-day exposure to phenobarbital.

Band number                          Highest sequence
(approximate size in bp)             similarity         FASTA-EMBL gene identification
1 (1500)                             95.3%              3-oxoacyl-CoA thiolase
2 (1200)                             92.3%              Hemopexin mRNA
3 (1000)                             91.7%              Alpha-2u-globulin mRNA
7 (700)                   Clone 1    77.2%              M. musculus C1 inhibitor
                          Clone 2    94.5%              Electron transfer flavoprotein
                          Clone 3    91.0%              M. musculus Topoisomerase 1 (Topo I)
8 (650)                   Clone 1    86.9%              Soares 2NbMT M. musculus (EST)
                          Clone 2    96.2%              Alpha-2u-globulin (s-type) mRNA
9 (600)                   Clone 1    86.9%              Soares mouse NML M. musculus (EST)
                          Clone 2    82.0%              Soares p3NMF19.5 M. musculus (EST)
10 (550)                             73.8%              Soares mouse NML M. musculus (EST)
11 (525)                             95.7%              NCI-CGAP-Pr1 H. sapiens (EST)
12 (373)                             100.0%             Ribosomal protein
13 (23)                   Clone 1    97.2%              Soares mouse embryo NbME135 (EST)
                          Clone 2    100.0%             Fibrinogen B-beta-chain
                          Clone 3    100.0%             Apolipoprotein E gene
14 (170)                             96.0%              Soares p3NMF19.5 M. musculus (EST)
15 (140)                             97.3%              Stratagene mouse testis (EST)
Others: (300)                        96.7%              R. norvegicus RASP 1 mRNA
        (275)                        93.1%              Soares mouse mammary gland (EST)

EST = Expressed sequence tag. Bands 4-6 were shown to be false positives by dot blot analysis and,
therefore, were not sequenced. Derived from Rockett et al. (1997). It should be noted that the above genes
do not represent the complete spectrum of genes which are down-regulated in rat liver by phenobarbital,
but simply represent the genes sequenced and identified to date.






display' (DD). In this method, all the mRNA species in the control and treated cell
populations are amplified in separate reactions using reverse transcriptase-PCR
(RT-PCR). The products are then run side-by-side on sequencing gels. Those
bands which are present in one display only, or which are much more intense in one
display compared to the other, are differentially expressed and may be recovered for
further characterization. One advantage of this system is the speed with which it can 
be carried out — 2 days to obtain a display and as little as a week to make and identify 
clones. 

Two commonly used variations are based on different methods of priming the 
reverse transcription step (figure 8). One is to use an oligo dT with a 2-base 'anchor'
at the 3'-end, e.g. 5'-(dT11)CA-3' (Liang and Pardee 1992). Alternatively, an
arbitrary primer may be used for 1st strand cDNA synthesis (Welsh et al. 1992). 
This variant of RNA fingerprinting has also been called 'RAP' (RNA Arbitrarily 
Primed)-PCR. One advantage of this second approach is that PCR products may be 
derived from anywhere in the RNA, including open reading frames. In addition, it 
can be used for mRNAs that are not polyadenylated, such as many bacterial mRNAs 
(Wong and McClelland 1994). In both cases, following reverse transcription and 
denaturation, second strand cDNA synthesis is carried out with an arbitrary primer 
(arbitrary primers have a single base at each position, as compared to random 
primers, which contain a mixture of all four bases at each position). The resulting 
PCR, thus, produces a series of products which, depending on the system (primer 
length and composition, polymerase and gel system), usually includes 50-100
products per primer set (Band and Sager 1989). When a combination of different
dT-anchors and arbitrary primers is used, almost all mRNA species from a cell can
be amplified. When the cDNA products from two different populations are analysed 
side by side on a polyacrylamide gel, differences in expression can be identified and 
the appropriate bands recovered for cloning and further analysis. 
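
The set of anchored oligo dT primers used for the first priming strategy is small and easily enumerated. The sketch below lists every primer with a two-base 3' anchor drawn from G, C and A, following the definition given in the caption to figure 8. The oligo dT length of 11 and the function name are assumptions made for this example.

```python
from itertools import product

DT_LENGTH = 11        # assumed length of the oligo dT stretch
ANCHOR_BASES = "GCA"  # bases allowed at each anchor position in this sketch


def anchored_primers(dt_length=DT_LENGTH, bases=ANCHOR_BASES):
    """List every oligo dT primer carrying a two-base 3' anchor (5' to 3')."""
    return ["T" * dt_length + first + second
            for first, second in product(bases, repeat=2)]


primers = anchored_primers()
print(len(primers))
for primer in primers:
    print(primer)
```

Each anchored primer selects a different subset of the polyadenylated mRNAs, so pairing each one with a panel of arbitrary primers for the second strand divides the mRNA population into manageable displays.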

Although DD is perhaps the most popular approach used today for identifying 
differentially expressed genes, it does suffer from several perceived disadvantages: 

(1) It may have a strong bias towards high copy number mRNAs (Bertioli et al.
1995), although this has been disputed (Wan et al. 1996) and the isolation of very
low abundance genes may be achieved in certain circumstances (Guimeraes et
al. 1995a).

(2) The cDNAs obtained often only represent the extreme 3' end of the mRNA
(often the 3'-untranslated region), although this may not always be the case
(Guimeraes et al. 1995a). Since the 3' end is often not included in Genbank and
shows variation between organisms, cDNAs identified by DD cannot always be
matched with their genes, even if they have been identified.

(3) The pattern of differential expression seen on the display often cannot be
reproduced on Northern blots, with false positives arising in up to 70 % of cases
(Sun et al. 1994). Some adaptations have been shown to reduce false positives,
including the use of two reverse transcriptases (Sung and Denman 1997),
comparison of uninduced and induced cells over a time course (Burn et al. 1994)
and comparison of DDPCR-products from two uninduced and two induced
lines (Sompayrac et al. 1995). The latter authors also reported that the use of
cytoplasmic RNA rather than total RNA reduces false positives arising from
nuclear RNA that is not transported to the cytoplasm.

Further details of the background, strengths and weaknesses of the DD
technique can be obtained from a review by McClelland et al. (1996) and from
articles by Liang et al. (1995) and Wan et al. (1996).

Figure 8. Two approaches to differential display (DD) analysis. 1st strand synthesis can be carried out
either with a polydT11NN primer (where N = G, C or A) or with an arbitrary primer. The use of
different combinations of G, C and A to anchor the first strand polydT primer enables the priming
of the majority of polyadenylated mRNAs. Arbitrary primers may hybridize at none, one or more
places along the length of the mRNA, allowing 1st strand cDNA synthesis to occur at none, one
or more points in the same gene. In both cases, 2nd strand synthesis is carried out with an arbitrary
primer. Since these arbitrary primers for the 2nd strand may also hybridize to the 1st strand cDNA
in a number of different places, several different 2nd strand products may be obtained from one
binding point of the 1st strand primer. Following 2nd strand synthesis, the original set of primers
is used to amplify the second strand products, with the result that numerous gene sequences are
amplified.

Restriction endonuclease-facilitated analysis of gene expression 
Serial Analysis of Gene Expression (SAGE) 

A more recent development in the field of differential display is SAGE analysis
(Velculescu et al. 1995). This method uses a different approach to those discussed so
far and is based on two principles. Firstly, in more than 95 % of cases, short
nucleotide sequences ('tags') of only nine or 10 base pairs provide sufficient
information to identify their gene of origin. Secondly, concatenation (linking
together in a series) of these tags allows sequencing of multiple cDNAs within a
single clone. Figure 9 shows a schematic representation of the SAGE process. In this
procedure, double stranded cDNA from the test cells is synthesized with a
biotinylated polydT primer. Following digestion with a commonly cutting (4 bp
recognition sequence) restriction enzyme ('anchoring enzyme'), the 3' ends of the
cDNA population are captured with streptavidin beads. The captured population is
split into two and different adaptors ligated to the 5' ends of each group. Incorporated
into the adaptors is a recognition sequence for a type IIS restriction enzyme, one
which cuts DNA at a defined distance (< 20 bp) from its recognition sequence.
Hence, following digestion of each captured cDNA population with the IIS enzyme,
the adaptors plus a short piece of the captured cDNA are released. The two
populations are then ligated and the products amplified. The amplified products are
cleaved with the original anchoring enzyme, religated (concatemers are formed in
the process) and cloned. The advantage of this system is that hundreds of gene tags
can be identified by sequencing only a few clones. Furthermore, the number of times
a given transcript is identified is a quantitative measurement of that gene's
abundance in the original population, a feature which facilitates identification of
differentially expressed genes in different cell populations.
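
Both principles of SAGE lend themselves to a short illustration. The sketch below first computes how many distinct tags of nine or 10 bp are possible, and then counts the tags in a concatemer as a measure of abundance. It is deliberately simplified: it ignores the pairing of tags produced by the ligation step and assumes that the anchoring enzyme site does not occur within a tag. The site `CATG`, the example sequence and the function name are assumptions made for illustration only.

```python
from collections import Counter

TAG_LENGTH = 9        # tag length in bp (the text gives nine or 10)
ANCHOR_SITE = "CATG"  # assumed 4 bp anchoring enzyme site, for illustration

# Number of distinct sequences available to tags of each length.
for length in (9, 10):
    print(f"{length} bp tags: {4 ** length} possible sequences")


def count_tags(concatemer, site=ANCHOR_SITE, tag_length=TAG_LENGTH):
    """Count the tags that follow each anchoring enzyme site in a concatemer."""
    tags = Counter()
    for piece in concatemer.split(site):
        if len(piece) >= tag_length:
            tags[piece[:tag_length]] += 1
    return tags


# Hypothetical concatemer containing three tags, two of them identical.
concatemer = "CATG" + "AAACCCGGG" + "CATG" + "TTTGGGCCC" + "CATG" + "AAACCCGGG"
tag_counts = count_tags(concatemer)
for tag, n in tag_counts.most_common():
    print(tag, n)
```

Comparing such tag counts between two libraries, after normalizing for the total number of tags sequenced in each, is what allows differentially expressed genes to be identified.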

Some disadvantages of SAGE analysis are that the method is technically difficult,
requires a large amount of accurate sequencing, is biased towards abundant
mRNAs, has not been validated in the pharmaco/toxicogenomic setting and has
only been used to examine well known tissue differences to date.

Gene Expression Fingerprinting (GEF) 

A different capture/restriction digest approach for isolating differentially 
expressed genes has been described by Ivanova and Belyavsky (1995). In this 
method, RNA is converted to cDNA using biotinylated oligo(dT) primers. The 
cDNA population is then digested with a specific endonuclease and captured with 
magnetic streptavidin microbeads to facilitate removal of the unwanted 5' digestion 
products. The use of restricted 3'-ends alone serves to reduce the complexity of the 
cDNA fragment pool and helps to ensure that each RNA species is represented by 
not more than one restriction product. An adaptor is ligated to facilitate subsequent 
amplification of the captured population. PCR is carried out with one adaptor- 
specific and one biotinylated polydT primer. The reamplified population is 
recaptured and the non-biotinylated strands removed by alkaline dissociation. The 
non-biotinylated strand is then resynthesized using a different adaptor-specific 
primer in the presence of a radiolabeled dNTP. The labelled immobilized 3' cDNA 
ends are next sequentially treated with a series of different restriction endonucleases 
and the products from each digestion analysed by PAGE. The result is a fingerprint 
composed of a number of ladders (equal to the number of sequential digests used). 
By comparing test versus control fingerprints, it is possible to identify differentially 
expressed products which can then be isolated from the gel and cloned. The 
advantages of this procedure are that it is very robust and reproducible, and the 
authors estimate that 80-93 % of cDNA molecules are involved in the final 
fingerprint. The disadvantage is that polyacrylamide gels can rarely resolve more 
than 300-400 bands, which compares poorly to the 1000 or more which are 
estimated to be produced in an average experiment. The use of 2-D gels such as 
those described by Uitterlinden et al. (1989) and Hatada et al. (1991) may help to 
overcome this problem. 

A similar method for displaying restriction endonuclease fragments was later 
described by Prashar and Weissman (1996). However, instead of sequential 
digestion of the immobilized 3'-terminal cDNA fragments, these authors simply 
compared the profiles of the control and treated populations without further 
manipulation. 

        TE   AE  Tag
        GGATGCATGXXXXXXXXX                GGATGCATGOOOOOOOOO
        CCTACGTACXXXXXXXXX                CCTACGTACOOOOOOOOO

                          | Ligate and amplify
                          v

        GGATGCATGXXXXXXXXXOOOOOOOOOCATGCATCC

                          | Cleave with AE, isolate ditags,
                          | concatenate, clone and sequence
                          v

        ...CATGXXXXXXXXXOOOOOOOOOCATGXXXXXXXXXOOOOOOOOOCATG...
        ...GTACXXXXXXXXXOOOOOOOOOGTACXXXXXXXXXOOOOOOOOOGTAC...
               Tag1     Tag2         Tag3     Tag4

Figure 9. Serial analysis of gene expression (SAGE) analysis. cDNA is cleaved with an anchoring enzyme 
(AE) and the 3' ends captured using streptavidin beads. The cDNA pool is divided in half and each 
portion ligated to a different linker, each containing a type IIS restriction site (tagging enzyme, 
TE). Restriction with the type IIS enzyme releases the linker plus a short length of cDNA 
(XXXXX and OOOOO indicate nucleotides of different tags). The two pools of tags are then 
ligated and amplified using linker-specific primers. Following PCR, the products are cleaved with 
the AE and the ditags isolated from the linkers using PAGE. The ditags are then ligated (during 
which process, concatenization occurs) and cloned into a vector of choice for sequencing. After 
Velculescu et al. (1995), with permission. 

DNA arrays 

'Open' differential display systems are cumbersome in that it takes a great deal 
of time to extract and identify candidate genes and then confirm that they are indeed 
up- or down-regulated in the treated compared to the control tissue. Normally, the 
latter process is carried out using Northern blotting or RT-PCR. Even so, each of 
the aforementioned steps produces a bottleneck to the ultimate goal of rapid analysis 
of gene expression. These problems will likely be addressed by the development of 
so-called DNA arrays (e.g. Gress et al. 1992, Zhao et al. 1995, Schena et al. 1996), 
the introduction of which has signalled the next era in differential gene expression 
analysis. DNA arrays consist of a gridded membrane or glass 'chip' containing 
hundreds or thousands of DNA spots, each consisting of multiple copies of part of 
a known gene. The genes are often selected based on previously proven involvement 
in oncogenesis, cell cycling, DNA repair, development and other cellular processes. 
They are usually chosen to be as specific as possible for each gene and animal species. 
Human and mouse arrays are already commercially available and a few companies 
will construct a personalized array to order, for example Clontech Laboratories and 
Research Genetics Inc. The technique is rapid in that hundreds or even thousands 
of genes can be spotted on a single array, and that mRNA/cDNA from the test 
populations can be labelled and used directly as probe. When analysed with 
appropriate hardware and software, arrays offer a rapid and quantitative means to 
assess differences in gene expression between two cell populations. Of course, there 
can only be identification and quantitation of those genes which are in the array 
(hence the term 'closed' system). Therefore, one approach to elucidating the 
molecular mechanisms involved in a particular disease/development system may be 
to combine an open and closed system — a DNA array to directly identify and 
quantitate the expression of known genes in mRNA populations, and an open 
system such as SSH to isolate unknown genes which are differentially expressed. 
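
As an illustration of how array data might be reduced to a list of differentially expressed genes, the Python sketch below compares spot intensities from a control and a treated array. It is only a sketch under stated assumptions: scaling both arrays to the same total signal is one simple normalization choice among many, the 2-fold threshold is arbitrary, and the gene names and the function `differential_spots` are hypothetical.

```python
def differential_spots(control, treated, threshold=2.0):
    """Return {gene: treated/control ratio} for spots changed by at least `threshold`-fold.

    `control` and `treated` map gene names to spot intensities. Only genes
    present on both arrays with a positive signal are compared.
    """
    genes = [g for g in control if g in treated and control[g] > 0 and treated[g] > 0]
    if not genes:
        return {}
    # assumed normalization: scale the treated array to the control array's total signal
    scale = sum(control[g] for g in genes) / sum(treated[g] for g in genes)
    changed = {}
    for g in genes:
        ratio = treated[g] * scale / control[g]
        if ratio >= threshold or ratio <= 1.0 / threshold:
            changed[g] = ratio
    return changed
```

Because only spotted genes can appear in the result, the function also makes the 'closed' nature of the system explicit: a gene absent from the array is never reported, however strongly it is regulated.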

One of the main advantages of DNA arrays is the huge number of gene fragments 
which can be put on a membrane— some companies have reported gridding up to 
60000 spots on a single glass 'chip' (microscope slide). These high density chip- 
based micro-arrays will probably become available as mass-produced off-the-shelf 
items in the near future. This should facilitate the more rapid determination of 
differential expression in time and dose-response experiments. Aside from their 
high cost and the technical complexities involved in producing and probing DNA 
arrays, the main problem which remains, especially with the newer micro-array 
(gene-chip) technologies, is that results are often not wholly reproducible between 
arrays. However, this problem is being addressed and should be resolved within the 
next few years. 



EST databases as a means to identify differentially-expressed genes 

Expressed sequence tags (ESTs) are partial sequences of clones obtained from 
cDNA libraries. Even though most ESTs have no formal identity (putative 
identification is the best to be hoped for), they have proven to be a rapid and efficient 
means of discovering new genes and can be used to generate profiles of gene 
expression in specific cells. Since they were first described by Adams et al. (1991), 
there has been a huge explosion in EST production and it is estimated that there are 
now well over a million such sequences in the public domain, representing over half 
of all human genes (Hillier et al. 1996). This large number of freely available 
sequences (both sequence information and clones are normally available royalty-free 
from the originators) has enabled the development of a new approach towards 
differential gene expression analysis as described by Vasmatzis et al. (1998). The 
approach is simple in theory: EST databases are first searched for genes that have a 
number of related EST sequences from the target tissue of choice, but none or few 
from non-target tissue libraries. Programmes to assist in the assembly of such sets of 
overlapping data may be developed in-house or obtained privately or from the 
internet. For example, the Institute for Genomic Research (TIGR, found at 
http://www.tigr.org) provides many software tools free of charge to the scientific 
community. Included amongst these is the TIGR assembler (Sutton et al. 1995), a 
tool for the assembly of large sets of overlapping data such as ESTs, bacterial 
artificial chromosomes (BACs), or small genomes. Candidate EST clones 
representing different genes are then analysed using RNA blot methods for size and tissue 
specificity and, if required, used as probes to isolate and identify the full length 
cDNA clone for further characterization. In practice however, the method is rather 
more involved, requiring bioinformatic and computer analysis coupled with 
confirmatory molecular studies. Vasmatzis et al. (1998) have described several 
problems in this fledgling approach, such as separating highly homologous 
sequences derived from different genes and an overemphasis of specificity for some 
EST sequences. However, since these problems will largely be addressed by the 
development of more suitable computer algorithms and an increased completeness 
of the EST database, it is likely that this approach to identifying differentially 
expressed genes may enjoy more patronage in the future. 
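
The database search described above can be sketched as a simple tally. The Python below is an illustration only, not the procedure of Vasmatzis et al. (1998): it assumes the ESTs have already been assembled into gene clusters, and the record format, thresholds and the function name `candidate_genes` are hypothetical.

```python
from collections import Counter


def candidate_genes(est_records, target_tissue, min_target=3, max_other=0):
    """Return gene clusters well represented in the target tissue but rare elsewhere.

    `est_records` is an iterable of (cluster_id, tissue) pairs, one per EST.
    """
    in_target = Counter()
    elsewhere = Counter()
    for cluster, tissue in est_records:
        if tissue == target_tissue:
            in_target[cluster] += 1
        else:
            elsewhere[cluster] += 1
    return sorted(
        cluster
        for cluster, n in in_target.items()
        if n >= min_target and elsewhere[cluster] <= max_other
    )
```

Relaxing `max_other` trades specificity for sensitivity, which mirrors the problem of over-emphasized specificity noted above: a cluster may look tissue-specific simply because few libraries from other tissues have been sequenced.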



Problems and potential of differential expression techniques 

The holistic or single cell approach? 

When working with in vivo models of differential expression, one of the first 
issues to consider must be the presence of multiple cell types in any given specimen. 
For example, a liver sample is likely to contain not only hepatocytes, but also 
(potentially) Ito cells, bile ductule cells, endothelial cells, various immune cells (e.g. 
lymphocytes, macrophages and Kupffer cells) and fibroblasts. Other tissues will 
each have their own distinctive cell populations. Also, in the case of neoplastic tissue 
there are almost always normal, hyperplastic and/or dysplastic cells present in a 
sample. One must, therefore, be aware that genes obtained from a differential 
display experiment performed on an animal tissue model may not necessarily arise 
exclusively from the intended 'target' cells, e.g. hepatocytes/neoplastic cells. If 
appropriate, further analyses using immunohistochemistry, in situ hybridization or 
in situ RT-PCR should be used to confirm which cell types are expressing the 
gene(s) of interest. This problem is probably most acute for those studying the 
differential expression of genes in the development of different cell types, where 
there is a need to examine homologous cell populations. The problem is now being 
addressed at the National Cancer Institute (Bethesda, MD, USA) where new 
microdissection techniques have been employed to assist in their gene analysis programme, 
the Cancer Genome Anatomy Project (CGAP) (for more information see web site: 
http://www.ncbi.nlm.nih.gov/ncicgap/intro.html). There are also separation 
techniques available that utilise cell-specific antigens as a means to isolate target cells, 
e.g. fluorescence activated cell sorting (FACS) (Dunbar et al. 1998, Kas-Deelen et 
al. 1998) and magnetic bead technology (Richard et al. 1998, Rogler et al. 1998). 

However, those taking a holistic approach may consider this issue unimportant. 
There is an equally appropriate view that all those genes showing altered expression 
within a compromised tissue should be taken into consideration. After all, since all 
tissues are complex mixes of different, interacting cell types which intimately 
regulate each other's growth and development, it is clear that each cell type could in 
some way contribute (positively or negatively) towards the molecular mechanisms 
which lie behind responses to external stimuli or neoplastic growth. It is perhaps 
then more informative to carry out differential display experiments using in vivo as 
opposed to in vitro models, where uniform populations of identical cells probably 
represent a partial, skewed or even inaccurate picture of the molecular changes that 
occur. 

The incidence and possible implications of inter-individual biological variation 
should be considered in any approach where whole animal models are being used. It 
is clear that individuals (humans and animals) respond in different ways to identical 
stimuli. One of the best characterized examples is the debrisoquine oxidation 
polymorphism, which is mediated by cytochrome CYP2D6 and determines the 
pharmacokinetics of many commonly prescribed drugs (Lennard 1993, Meyer and 
Zanger 1997). The reasons for such differences are varied and complex, but allelic 
variations, regulatory region polymorphisms and even physical and mental health 
can all contribute to observed differences in individual responses. Careful thought 
should, therefore, be given to the specific objectives of the study and to the possible 
value of pooling starting material (tissue/mRNA). The effect of this can be 
beneficial through the ironing out of exaggerated responses and unimportant minor 
fluctuations of (mechanistically) irrelevant genes in individual animals, thus 
providing a clearer overall picture of the general molecular mechanisms of the 
response. However, at the same time such minor variations may be of utmost 
importance in deciding the ability of individual animals to succumb to or resist the 
effects of a given chemical/disease. 



How efficient are differential expression techniques at recovering a high percentage of 
differentially expressed genes? 

A number of groups have produced experimental data suggesting that 
mammalian cells produce between 8000-15000 different mRNA species at any one time 
(Mechler and Rabbitts 1981, Hedrick et al. 1984, Bravo 1990), although figures as 
high as 20-30000 have also been quoted (Axel et al. 1976). Hedrick et al. (1984) 
provided evidence suggesting that the majority of these belong to the rare abundance 
class. A breakdown of this abundance distribution is shown in table 1. 

When the results of differential display experiments have been compared with 
data obtained previously using other methods, it is apparent that not all differentially 
expressed mRNAs are represented in the final display. In particular, rare messages 
(which, importantly, often include regulatory proteins) are not easily recovered 
using differential display systems. This is a major shortcoming, as the majority of 
mRNA species exist at levels of less than 0.005% of the total population (table 1). 
Bertioli et al. (1995) examined the efficiency of DD templates (heterogeneous 
mRNA populations) for recovering rare messages and were unable to detect mRNA 
species present at less than 1.2% of the total mRNA population, equivalent to an 
intermediate or abundant species. Interestingly, when simple model systems (single 
target only) were used instead of a heterogeneous mRNA population, the same 
primers could detect levels of target mRNA 10000-fold lower. These results 
are probably best explained by competition for substrates from the many PCR 
products produced in a DD reaction. 
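
The size of this shortfall is easier to appreciate when the quoted figures are put side by side. The short Python calculation below uses only the percentages given in the text; the variable and function names are illustrative.

```python
RARE_CLASS_PCT = 0.005         # upper bound quoted for most mRNA species (table 1)
DD_LIMIT_PCT = 1.2             # detection limit reported by Bertioli et al. (1995)
SIMPLE_MODEL_FACTOR = 10_000   # improvement seen with a single-target template


def shortfall(limit_pct, abundance_pct):
    """Fold increase in abundance a message would need to reach the detection limit."""
    return limit_pct / abundance_pct


gap = shortfall(DD_LIMIT_PCT, RARE_CLASS_PCT)
simple_model_limit_pct = DD_LIMIT_PCT / SIMPLE_MODEL_FACTOR

print(f"Rare messages fall short of the DD limit by at least {gap:.0f}-fold")
print(f"Single-target detection limit: {simple_model_limit_pct:.5f}% of total mRNA")
```

On these figures the single-target limit lies well below the rare-class threshold, which is consistent with competition in the complex template, rather than primer performance, being the limiting factor.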

The numbers of differentially expressed mRNAs reported in the literature using 
various model systems provide further evidence that many differentially expressed 
mRNAs are not recovered. For example, DeRisi et al. (1997) used DNA array 
technology to examine gene expression in yeast following exhaustion of sugar in the 
medium, and found that more than 1700 genes showed a change in expression of at 
least 2-fold. In light of such a finding, it would not be unreasonable to suggest that 
of the 8000-15000 different mRNA species produced by any given mammalian cell, 
up to 1000 or more may show altered expression following chemical stimulation. 
Whilst this may be an extreme figure, it is known that at least 100 genes are 
activated/upregulated in Jurkat (T-) cells following IL-2 stimulation (Ullman et al. 
1990). In addition, Wan et al. (1996) estimated that interferon-γ-stimulated HeLa 
cells differentially express up to 433 genes (assuming 24000 distinct mRNAs 
expressed by the cells). However, there have been few publications documenting 
anywhere near the recovery of these numbers. For example, in using DD to compare 
normal and regenerating mouse liver, Bauer et al. (1993) found only 70 of 38000 
total bands to be different. Of these, 50% (35 genes) were shown to correspond to 
differentially expressed bands. Chen et al. (1996) reported 10 genes upregulated in 
female rat liver following ethinyl estradiol treatment. McKenzie and Drake (1997) 
identified 14 different gene products whose expression was altered by phorbol 
myristate acetate (PMA, a tumour promoter agent) stimulation of a human 
myelomonocytic cell line. Kilty and Vickers (1997) identified 10 different gene 
products whose expression was upregulated in the peripheral blood leukocytes of 
allergic disease sufferers. Linskens et al. (1995) found 23 genes differentially 
expressed between young and senescent fibroblasts. Techniques other than DD 
have also provided an apparent paucity of differentially expressed genes. Using SH 
for example, Cao et al. (1997) found 15 genes differentially expressed in colorectal 
cancer compared to normal mucosal epithelium. Fitzpatrick et al. (1995) isolated 17 
genes upregulated in rat liver following treatment with the peroxisome proliferator, 
clofibrate; Philips et al. (1990) isolated 12 cDNA clones which were upregulated in 
highly metastatic mammary adenocarcinoma cell lines compared to poorly 
metastatic ones. Prashar and Weissman (1996) used 3' restriction fragment analysis and 
identified approximately 40 genes showing altered expression within 4 h of 
activation of Jurkat T-cells. Groenink and Leegwater (1996) analysed 27 gene 
fragments isolated using SSH of delayed early response phase of liver regeneration 
and found only 12 to be upregulated. 
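
The discrepancy between estimated and recovered numbers can be made concrete with a little arithmetic on the figures quoted above. The Python sketch below is illustrative only; in particular, scaling the Wan et al. (1996) estimate proportionally to a cell expressing 8000-15000 mRNA species is an assumption made here for comparison, not a claim from the cited work.

```python
def fraction_changed(n_changed, n_total):
    """Fraction of expressed species (or displayed bands) reported as changed."""
    return n_changed / n_total


def scaled_estimate(n_changed, n_total, n_species):
    """Scale a reported fraction to a cell expressing `n_species` mRNAs (proportionality assumed)."""
    return round(fraction_changed(n_changed, n_total) * n_species)


wan_fraction = fraction_changed(433, 24000)    # Wan et al. (1996), estimated genes
bauer_fraction = fraction_changed(70, 38000)   # Bauer et al. (1993), displayed bands
low = scaled_estimate(433, 24000, 8000)
high = scaled_estimate(433, 24000, 15000)

print(f"Estimated fraction changed (Wan et al.): {wan_fraction:.2%}")
print(f"Fraction of bands different (Bauer et al.): {bauer_fraction:.2%}")
print(f"Scaled estimate for 8000-15000 species: {low}-{high} genes")
```

Even on this conservative scaling, the expected number of changed genes is an order of magnitude above the 10-40 genes typically recovered in the studies listed.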

In this laboratory, SSH was used to isolate up to 70 candidate genes which appear 
to show altered expression in guinea pig liver following short-term treatment with 
the peroxisome proliferator, WY-14,643 (Rockett, Swales, Esdaile and Gibson, 
unpublished observations). However, these findings have still to be confirmed by 
analysis of the extracted tissue mRNA for differential expression of these sequences. 

Whilst the latest differential display technologies are purported to include design 
and experimental modifications to overcome this lack of efficiency (in both the total 
number of differentially expressed genes recovered and the percentage that are true 
positives), it is still not clear if such adaptations are practically effective. Demonstrating 
efficiency by spiking with a known amount of limited numbers of artificial 
construct(s) is one thing, but isolating a high percentage of the rare messages already 
present in an mRNA population is another. Of course, some models will genuinely 
produce only a small number of differentially expressed genes. In addition, there are 
also technical problems that can reduce efficiency. For example, mRNAs may have 
an unusual primary structure that effectively prevents their amplification by 
PCR-based systems. In addition, it is known that under certain circumstances not all 
mRNAs have 3' polyA sites. For example, during Xenopus development, 
deadenylation is used as a means to stabilize RNAs (Voeltz and Steitz 1998) whilst 
preferential deadenylation may play a role in regulating Hsp70 (and perhaps, 
therefore, other stress protein) expression in Drosophila (Dellavalle et al. 1994). The 
presence of deadenylated mRNAs would clearly reduce the efficiency of systems 
utilizing a polydT reverse transcription step. The efficiency of any system also 
depends on the quality of the starting material. All differential display techniques 
use mRNA as their target material. However, it is difficult to isolate mRNA that is 
entirely free of ribosomal RNA. Even when polydT primers are used to prime first 
strand cDNA synthesis, ribosomal RNA is often transcribed to some degree 
(Clontech PCR-Select cDNA Subtraction kit user manual). It has been shown, at 
least in the case of SSH, that a high rRNA : mRNA ratio can lead to inefficient 
subtractive hybridization (Clontech PCR-Select cDNA Subtraction kit user 
manual) and there is no reason to suppose that it will not do likewise in other SH 
approaches. Finally, those techniques that utilise a presubtraction amplification step 
(e.g. RDA) may present a skewed representation since some sequences amplify 
better than others. 

Of course, probably the most important consideration is the temporal factor. It 
is clear that any given differential display experiment can only interrogate a cell at 
one point in time. It may well be that a high percentage of the genes showing altered 
expression at that time are obtained. However, given that disease processes and 
responses to environmental stimuli involve dynamic cascades of signalling, 
regulation, production and action, it is clear that all those genes which are switched 
on/off at different times will not be recovered and, therefore, vital information may 
well be missed. It is, therefore, imperative to obtain as much information about the 
model system beforehand as possible, from which a strategy can be derived for 
targeting specific time points or events that are of particular interest to the 
investigator. One way of getting round this problem of single time point analysis is 
to conduct the experiment over a suitable time course which, of course, adds 
substantially to the amount of work involved. 



How sensitive are differential expression technologies? 

There has been little published data that addresses the issue of how large the 
change in expression must be for it to permit isolation of the gene in question with 
the various differential expression technologies. Although the isolation of genes 
whose expression is changed as little as 1.5-fold has been reported using SSH 
(Groenink and Leegwater 1996), it appears that those demonstrating a change in 
excess of 5-fold are more likely to be picked up. Thus, there is a 'grey zone' 
[...] experiments and animals. DD, on the other hand, is not subject to this grey 
zone since, unlike SH approaches, it does not amplify the difference in expression 
between two samples. Wan et al. (1996) reported that differences in expression of 
twofold or more are detectable using DD. 



Resolution and visualization of differential expression products 

It seems highly improbable with current technology that a gel system could be 
developed that is able to resolve all gene species showing altered expression in any 
given test system (be it SH- or DD-based). Polyacrylamide gel electrophoresis 
(PAGE) can resolve size differences down to 0.2% (Sambrook et al. 1989) and is 
used as standard in DD experiments. Even so, it is clear that a complex series of gene 
products such as those seen in a DD will contain unresolvable components. Thus, 
what appears to be one band in a gel may in fact turn out to be several. Indeed, it has 
been well documented (Mathieu-Daude et al. 1996, Smith et al. 1997) that a single 
band extracted from a DD often represents a composite of heterogeneous products, 
and the same has been found for SSH displays in this laboratory (Rockett et al. 
1997). One possible solution was offered by Mathieu-Daude et al. (1996), who 
extracted and reamplified candidate bands from a DD display and used single strand 
conformation polymorphism (SSCP) analysis to confirm which components 
represented the truly differentially expressed product. 

Many scientists often try to avoid the use of PAGE where possible because it is 
technically more demanding than agarose gel electrophoresis (AGE). Unfortunately, 
high resolution agarose gels such as Metaphor (FMC, Lichfield, UK) and AquaPor 
HR (National Diagnostics, Hessle, UK), whilst easier to prepare and manipulate 
than PAGE, can only separate DNA sequences which differ in size by around 
1.5-2% (15-20 base pairs for a 1 kb fragment). Thus, SSH, RDA or other such 
products which differ in size by less than this amount are normally not resolvable. 
However, a simple technique does in fact exist for increasing the resolving power of 
AGE— the inclusion of HA-red (10-phenyl neutral red-PEG ligand) or HA-yellow 
(bisbenzamide-PEG ligand) (Hanse Analytik GmbH, Bremen, Germany) in a 
gel separates identical or closely sized products on base content. Specifically, 
HA-red and -yellow selectively bind to GC and AT DNA motifs, respectively 
(Wawer et al. 1995, Hanse Analytik 1997, personal communication). Since both 
HA-stains possess an overall positive charge, they migrate towards the cathode 
when an electric field is applied. This is in direct opposition to DNA, which 
is negatively charged and, therefore, migrates towards the anode. Thus, if two 
DNA clones are identical in size (as perceived on a standard high resolution 
agarose gel), but differ in AT/GC content, inclusion of a HA-dye in the gel 
will effectively retard the migration of one of the sequences compared to the 
other, effectively making it apparently larger and, thus, providing a means of 
differentiating between the two. The use of HA-red has been shown to resolve 
sequences with an AT variation of less than 1 % (Wawer et al. 1995), whilst Hanse 
Analytik have reported that HA staining is so sensitive that in one case it was used 
to distinguish two 567 bp sequences which differed by only a single point mutation 
(Hanse Analytik 1996, personal communication). Therefore, if one wishes to check 
whether all the clones produced from a specific band in a differential display 
experiment are derived from the same gene species, a small amount of reamplified 
or digested clone can be run on a standard high resolution gel, and a second aliquot 
in a similar gel containing one of the HA-stains. The standard gel should indicate 
any gross size differences, whilst the HA-stained gel should separate otherwise 
unresolvable species (on standard AGE) according to their base content. Geisinger 
et al. (1997) reported successful use of this approach for identifying DD-derived 
clones. Figure 10 shows such an experiment carried out in this laboratory on clones 
obtained from a band extracted from an SSH display. 

Figure 10. Discrimination of clones of identical/nearly identical size using HA-red. Bands of decreasing 
size (1-5) were extracted from the final display of a suppression subtractive hybridization 
experiment and cloned. Seven colonies were picked at random from each cloned band and their 
inserts amplified using PCR. The products were run on two gels: (A) a high resolution 2 % agarose 
gel, and (B) a high resolution 2 % agarose gel containing 1 U/ml HA-red. With few exceptions, all 
the clones from each band appear to be the same size (gel A). However, the presence of HA-red 
(gel B), which separates identically-sized DNA fragments based on the percentage of GC within 
the sequence, clearly indicates the presence of different gene species within each band. For 
example, even though all five re-amplified clones of band 1 appear to be the same size, at least four 
different gene species are represented. 
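
Where clone sequences are already available, the reasoning behind the two-gel comparison can be sketched in Python. The resolution limits below are those quoted in the text (about 0.2% for PAGE and 1.5-2% for high resolution agarose); the function names are hypothetical, and treating any difference in GC fraction as resolvable on an HA gel is a simplification made for illustration.

```python
def min_resolvable_bp(fragment_bp, resolution_fraction):
    """Smallest size difference (bp) a gel can resolve for a fragment of the given length."""
    return fragment_bp * resolution_fraction


def gc_fraction(seq):
    """Fraction of G and C bases in a DNA sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def needs_ha_gel(seq_a, seq_b, resolution_fraction=0.015):
    """True if two inserts co-migrate on plain agarose but differ in base content."""
    longer = max(len(seq_a), len(seq_b))
    co_migrate = abs(len(seq_a) - len(seq_b)) < min_resolvable_bp(longer, resolution_fraction)
    # simplification: any difference in GC fraction is taken as separable with an HA-stain
    return co_migrate and gc_fraction(seq_a) != gc_fraction(seq_b)
```

For a 1 kb fragment, `min_resolvable_bp` gives about 15-20 bp for agarose and about 2 bp for PAGE, matching the figures above.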

An alternative approach is to carry out a 2-D analysis of the differential display 
products. In this approach, size-based separation is first carried out in a standard 
agarose gel. The gel slice containing the display is then extracted and incorporated 
into a HA gel for resolution based on AT/GC content. 

Of course, one should always consider the possibility of there being different 
gene species which are the same size and have the same GC/AT content. However, 
even these species are not unresolvable given some effort: again, one might use 
SSCP, or perhaps a denaturing gradient gel electrophoresis (DGGE) or temperature 
gradient gel electrophoresis (TGGE) approach to resolve the contents of a band, 
either directly on the extracted band (Suzuki et al. 1991) or on the reamplified 
product. 

The requirement of some differential display techniques to visualize large 
numbers of products (e.g. DD and GEF) can also present a problem in that, in terms 
of numbers, the resolution of PAGE rarely exceeds 300-400 bands. One approach to 
overcoming this might be to use 2-D gels such as those described by Uitterlinden et 
al. (1989) and Hatada et al. (1991). 





Extraction of differentially expressed bands from a gel can be complex since, in 
some cases (e.g. DD, GEF), the results are visualized by autoradiographic means, 
such that precise overlay of the developed film on the gel must occur if the correct 
band is to be extracted for further analysis. Clearly, a misjudged extraction can 
account for many man-hours lost. This problem, and that of the use of radioisotopes, 
has been addressed by several groups. For example, Lohmann et al. (1995) 
demonstrated that silver staining can be used directly to visualize DD bands in 
horizontal PAGs. An et al. (1996) avoided the use of radioisotopes by transferring a 
small amount (20-30%) of the DNA from their DD to a nylon membrane, and 
visualizing the bands using chemiluminescent staining before going back to extract 
the remaining DNA from the gel. Chen and Peck (1996) went one step further and 
transferred the entire DD to a nylon membrane. The DNA bands were then 
visualized using a digoxigenin (DIG) system (DIG was attached to the polydT 
primers used in the differential display procedure). Differentially expressed bands 
were cut from the membrane and the DNA eluted by washing with PCR buffer prior 
to reamplification. 

One of the advantages of using techniques such as SSH and RDA is that the final 
display can be run on an agarose gel and the bands visualized with simple ethidium 
bromide staining. Whilst this approach can provide acceptable results, overstaining 
with SYBR Green I or SYBR Gold nucleic acid stains (FMC) effectively enhances 
the intensity and sharpness of the bands. This greatly aids in their precise extraction 
and often reveals some faint products that may otherwise be overlooked. Whilst 
differential displays stained with SYBR Green I are better visualized using short 
wavelength UV (254 nm) rather than medium wavelength (306 nm), the shorter 
wavelength is much more DNA damaging. In practice, it takes only a few seconds 
to damage DNA extracted under 254 nm irradiation, effectively preventing 
reamplification and cloning. The best approach is to overstain with SYBR Green I 
and extract bands under a medium wavelength UV transillumination. 



The possible use of 'microfingerprinting' to reduce complexity 

Given the sheer number of gene products and the possible complexity of each 
band, an alternative approach to rapid characterization may be to use an enhanced 
analysis of a small section of a differential display, a 'sub-fingerprint' or 
'micro-fingerprint'. In this case, one could concentrate on those bands which only appear 
in a particular chosen size region. Reducing the fingerprint in this way has at least 
two advantages. One is that it should be possible to use different gel types, 
concentrations and run times tailored exactly to that region. Currently, one might 
run products from 100-3000 + bp on the same gel, which leads to compromize in the 
gel system being used and consequently to suboptimal resolution, both in terms of 
size and numbers, and can lead to problems in the accurate excision of individual' 
bands. Secondly, it may be possible to enhance resolution by using a 2-D analysis 
using a HA-stain, as described earlier. In summary, if a range of gene product sizes 
is carefully chosen to included certain 4 relevant ' genes, the 2-D system standardized, 
and appropriate gene analysis used, it may be possible to develop a method for the 
early and rapid identification of compounds which have similar or widely different 
"cellular effects. If the prognosis for exposure to one or more other chemicals which 
.display a simiiar_prpfile is already .kn own , then one cou ld perhaps predict similar 
effects for any new compounds which show a similar micro-fingerprint. 
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An alternative approach to microfingerprinting is to examine altered expression 
in specific families of genes through careful selection of PCR primers and/or 
post-reaction analysis. Stress genes, growth factors and/or their receptors, cell cycling 
genes, cytochromes P450 and regulatory proteins might be considered as candidates 
for analysis in this way. Indeed, some off-the-shelf DNA arrays (e.g. Clontech's 
Atlas cDNA Expression Array series) already anticipated this to some degree by 
grouping together genes involved in different responses, e.g. apoptosis, stress, 
DNA-damage response etc. 



Screening 

False positives 

The generation of false positives has been discussed at length amongst the 
differential display community (Liang et al. 1993, 1995, Nishio et al. 1994, Sun et al. 
1994, Sompayrac et al. 1995). The reason for false positives varies with the 
technique being used. For instance, in RDA, the use of adaptors which have not 
been HPLC purified can lead to the production of false positives through illegitimate 
ligation events (O'Neill and Sinclair 1997), whilst in DD they can arise through 
PCR artifacts and illegitimate transcription of rRNA. In SH, false positives appear 
to derive largely from abundant gene species, although some may arise from 
species which do not undergo hybridization for technical reasons. 

A quick screening of putative differentially expressed clones can be carried out 
using a simple dot blot approach, in which labelled first strand probes synthesized 
from tester and driver mRNA are hybridized to an array of said clones (Hedrick et 
al. 1984, Sakaguchi et al. 1986). Differentially expressed clones will hybridize to 
tester probe, but not driver. The disadvantage of this approach is that rare species 
may not generate detectable hybridization signals. One option for those using SSH 
is to screen the clones using a labelled probe generated from the subtracted cDNA 
from which it was derived, and with a probe made from the reverse subtraction 
reaction (ClonTechniques 1997a). Since the SSH method enriches rare sequences 
it should be possible to confirm the presence of clones representing low abundance 
genes. Despite this quick screening step, there is still the need to go back to the 
original mRNA and confirm the altered expression using a more quantitative 
approach. Although this may be achieved using Northern blots, the sensitivity is 
poor by today's high standards and one must rely on PCR methods for accurate and 
sensitive determinations (see below). 



Sequence analysis 

The majority of differential display procedures produce final products which are 
between 100 and 1000 bp in size. However, this may considerably reduce the size of 
the sequence for analysis of the DNA databases. This in turn leads to a reduced 
confidence in the result: several families of genes have members whose DNA 
sequences are almost identical except in a few key stretches, e.g. the cytochrome 
P450 gene superfamily (Nelson et al. 1996). Thus, does the clone identified as being 
almost identical to gene X really come from that gene, or its brother gene X1, or its 
as yet undiscovered sister X2? For example, using SSH, part of a gene was isolated 
which was up-regulated in the liver of rats exposed to Wy-14,643 and was identified 
by a FASTA search as being transferrin (data not shown). However, transferrin is 
known to be downregulated by hypolipidemic peroxisome proliferators such as 
Wy-14,643 (Hertz et al. 1996), and this was confirmed with subsequent RT-PCR 
analysis. This suggests that the gene sequence isolated may belong to a gene which 
is closely related to transferrin, but is regulated by a different mechanism. 

A further problem associated with SH technology is redundancy. In most cases 
before SH is carried out, the cDNA population must first be simplified by restriction 
digestion. This is important for at least two reasons: 

(1) To reduce complexity— long cDNA fragments may form complex networks 
which prevent the formation of appropriate hybrids, especially at the high 
concentrations required for efficient hybridization. 

(2) Cutting the cDNAs into small fragments provides better representation of 
individual genes. This is because genes derived from related but distinct 
members of gene families often have similar coding sequences that may cross- 
hybridize and be eliminated during the subtraction procedure (Ko 1990). 
Furthermore, different fragments from the same cDNA may differ considerably 
in terms of hybridization and amplification and, thus, may not efficiently do one 
or the other (Wang and Brown 1991). Thus, some fragments from differentially 
expressed cDNAs may be eliminated during subtractive hybridization procedures. 
However, other fragments may be enriched and isolated. As a 
consequence of this, some genes will be cut one or more times, giving rise to two 
or more fragments of different sizes. If those same genes are differentially 
expressed, then two or more of the different size fragments may come through 
as separate bands on the final differential display, increasing the observed 
redundancy and increasing the number of redundant sequencing reactions. 

Sequence comparisons also throw up another important point— at what degree 
of sequence similarity does one accept a result? Is 90% identity between a gene 
derived from your model species and another acceptably close? Is 95% between 
your sequence and one from the same species also acceptable? This problem is 
particularly relevant when the forward and reverse sequence comparisons give 
similar sequences with completely different gene species! An arbitrary decision 
seems to be to allocate genes that are definite (95% and above similarity) and then 
group those between 60 and 95% as being related or possible homologues. 
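
The arbitrary cut-offs just described can be written down as a simple rule. The following is only an illustrative sketch in Python: the function name and the 'unassigned' label for hits below 60% are assumptions made here for the example and are not part of the original scheme. 

```python
def classify_hit(percent_identity: float) -> str:
    """Bin a database hit by percent identity, using the arbitrary
    cut-offs quoted in the text (95% and above; 60 to 95%)."""
    if not 0.0 <= percent_identity <= 100.0:
        raise ValueError("percent identity must lie between 0 and 100")
    if percent_identity >= 95.0:
        return "definite"
    if percent_identity >= 60.0:
        return "related or possible homologue"
    # Assumption: the text does not say how hits below 60% are treated.
    return "unassigned"


if __name__ == "__main__":
    for identity in (98.2, 90.0, 41.5):
        print(identity, classify_hit(identity))
```

Such a rule does not resolve the case where forward and reverse reads match different genes; those clones would still need to be inspected by hand. 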

Quantitative analysis 

At some point, one must give consideration to the quantitative analysis of the 
candidate genes, either as a means of confirming that they are truly differentially 
expressed, or in order to establish just what the differences are. Northern blot 
analysis is a popular approach as it is relatively easy and quick to perform. However, 
the major drawback with Northern blots is that they are often not sensitive enough 
to detect rare sequences. Since the majority of messages expressed in a cell are of low 
abundance (see table 1), this is a major problem. Consequently, RT-PCR may be the 
method of choice for confirming differential expression. Although the procedure is 
somewhat more complex than Northern analysis, requiring synthesis of primers and 
optimization of reaction conditions for each gene species, it is now possible to set up 
high throughput PCR systems using multichannel pipettes, 96+ well plates and 
appropriate thermal cycling technology. Whilst quantitative analysis is more 
desirable, being more accurate and without reliance on an internal standard, the 
money and time needed to develop a competitor molecule is often excessive, 
especially when one might be examining tens or even hundreds of gene species. The 
use of semi-quantitative analysis is simpler, although still relatively involved. One 
must first of all choose an internal standard that does not change in the test cells 
compared to the controls. Numerous reference genes have been tried in the past, for 
example interferon-gamma (IFN-γ, Frye et al. 1989), β-actin (Heuval et al. 1994), 
glyceraldehyde-3-phosphate dehydrogenase (GAPDH, Wong et al. 1994), 
dihydrofolate reductase (DHFR, Mohler and Butler 1991), β-2-microglobulin (β-2-m, 
Murphy et al. 1990), hypoxanthine phosphoribosyl transferase (HPRT, Foss et 
al. 1998) and a number of others (ClonTechniques 1997b). Ideally, an internal 
standard should not change its level of expression in the cell regardless of cell age, 
stage in the cell cycle or through the effects of external stimuli. However, it has been 
shown on numerous occasions that the levels of most housekeeping genes currently 
used by the research community do in fact change under certain conditions and in 
different tissues (ClonTechniques 1997b). It is imperative, therefore, that 
preliminary experiments be carried out on a panel of housekeeping genes to establish 
their suitability for use in the model system. 
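
One way such a preliminary check might be organized is sketched below in Python. It ranks candidate housekeeping genes by how much their signal varies across control and treated samples, then expresses a target gene relative to the chosen standard. The use of the coefficient of variation as the stability measure, the function names and the example intensities are all assumptions made for illustration; they are not taken from the studies cited above. 

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Standard deviation divided by the mean; lower means more stable."""
    average = mean(values)
    if average == 0:
        raise ValueError("mean signal is zero; cannot compute variation")
    return stdev(values) / average


def rank_reference_genes(signals):
    """Order candidate standards from most to least stable.

    signals maps a gene name to its band intensities measured across
    all control and treated samples (at least two values per gene).
    """
    return sorted(signals, key=lambda gene: coefficient_of_variation(signals[gene]))


def normalize(target, reference):
    """Express each target signal relative to the internal standard
    measured in the same sample."""
    if len(target) != len(reference):
        raise ValueError("target and reference must cover the same samples")
    return [t / r for t, r in zip(target, reference)]


if __name__ == "__main__":
    # Hypothetical band intensities: two control then two treated samples.
    panel = {
        "GAPDH": [100.0, 102.0, 98.0, 101.0],
        "beta-actin": [100.0, 150.0, 80.0, 130.0],
    }
    best = rank_reference_genes(panel)[0]
    print("most stable candidate:", best)
    print(normalize([20.0, 22.0, 61.0, 58.0], panel[best]))
```

A gene that ranks well in one model system may still be unsuitable in another, so the panel would need to be re-examined for each tissue and treatment. 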

Interpretation of quantitative data must also be treated with caution. By 
comparing the lists of genes identified by differential expression one can perhaps 
gain insight into why two different species react in different ways to external stimuli. 
For example, rats and mice appear sensitive to the non-genotoxic effects of a wide 
range of peroxisome proliferators whilst Syrian hamsters and guinea pigs are largely 
resistant (Orton et al. 1984, Rodricks and Turnbull 1987, Lake et al. 1989, 1993, 
Makowska et al. 1992). A simplified approach to resolving the reason(s) why is to 
compare lists of up- and down-regulated genes in order to identify those which are 
expressed in only one species and, through background knowledge of the effects of 
the said gene, might suggest a mechanism of facilitated non-genotoxic carcinogenesis 
or protection. Of course, the situation is likely to be far more complex. Perhaps if 
there were one key gene protecting guinea pig from non-genotoxic effects and it was 
upregulated 50 times by PPs, the same gene might only be up-regulated five times 
in the rat. However, since both were noted to be upregulated, the importance of the 
gene may be overlooked. Just to complicate matters, a large change in expression 
does not necessarily mean a biologically important change. For example, what is the 
true relevance of gene Y which shows a 50-fold increase after a particular treatment, 
and gene Z which shows only a 5-fold increase? If one examines the literature one 
may find that historically, gene Y has often been shown to be up-regulated 
40-60-fold by a number of unrelated stimuli; in light of this the 50-fold increase would 
appear less significant. However, the literature may show that gene Z has never been 
recorded as having more than doubled in expression, which makes your 5-fold 
increase all the more exciting. Perhaps even more interesting is if that same 5-fold 
increase has only been seen in related neoplasms, or following treatment with related 
chemicals. 
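
The gene Y and gene Z comparison can be made concrete with a small calculation. The Python sketch below divides an observed fold change by the largest fold change previously reported for that gene; the name `novelty_ratio` and the idea of using this quotient as a rough indicator are assumptions introduced here for illustration only, not an established metric. 

```python
def novelty_ratio(observed_fold: float, historical_max_fold: float) -> float:
    """Observed fold change relative to the largest fold change
    previously reported for the same gene (hypothetical indicator)."""
    if observed_fold <= 0 or historical_max_fold <= 0:
        raise ValueError("fold changes must be positive")
    return observed_fold / historical_max_fold


if __name__ == "__main__":
    # Gene Y: 50-fold now, but 40-60-fold has often been seen before.
    print("gene Y:", novelty_ratio(50.0, 60.0))
    # Gene Z: 5-fold now, never previously more than doubled.
    print("gene Z:", novelty_ratio(5.0, 2.0))
```

On this crude measure gene Z stands out more than gene Y despite its smaller absolute change, which is the point made in the text; biological importance would still have to be established experimentally. 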

Problems in using the differential display approach 

Differential display technology originally held promise of an easily obtainable 
'fingerprint' of those genes which are up- or down-regulated in test animals/cells in 
a developmental process or following exposure to given stimuli. However, it has 
become clear that the fingerprinting process, whilst still valid, is much too complex 
to be represented by a single technique profile. This is because all differential display 
techniques have common and/or unique technical problems which preclude the 
isolation and identification of all those genes which show changes in expression. 
Furthermore, there are important genetic changes related to disease development 
which differential expression analysis is simply not designed to address. An example 
of this is the presence of small deletions, insertions, or point mutations such as those 
seen in activated oncogenes, tumour suppressor genes and individual poly- 
morphisms. Polymorphic variations, small though they usually are, are often 
regarded as being of paramount importance in explaining why some patients 
respond better than others to certain drug treatments (and, in logical extension, why 
some people are less affected by potentially dangerous xenobiotics/carcinogens than 
others). The identification of such point mutations and naturally occurring 
polymorphisms requires the subsequent application of sequencing, SSCP, DGGE 
or TGGE to the gene of interest. Furthermore, differential display is not designed 
to address issues such as alternatively spliced gene species or whether an increased 
abundance of mRNA is a result of increased transcription or increased mRNA 
stability. 

Conclusions 

Perhaps the main advantage of open system differential display techniques is that 
they are not limited by extant theories or researcher bias in revealing genes which are 
differentially expressed, since they are designed to amplify all genes which 
demonstrate altered expression. This means that they are useful for the isolation of 
previously unknown genes which may turn out to be useful biomarkers of a particular 
state or condition. At least one open system (SAGE) is also quantitative, thus 
eliminating the need to return to the original mRNA and carry out Northern/PCR 
analysis to confirm the result. However, the rapid progress of genome mapping 
projects means that over the next 5-10 years or so, the balance of experimental use 
will switch from open to closed differential display systems, particularly DNA 
arrays. Arrays are easier and faster to prepare and use, provide quantitative data, are 
suitable for high throughput analysis and can be tailored to look at specific signalling 
pathways or families of genes. Identification of all the gene sequences in human and 
common laboratory animals combined with improved DNA array technology, 
means that it will soon no longer be necessary to try to isolate differentially expressed 
genes using the technically more demanding open system approach. Thus, their 
main advantage (that of identifying unknown genes) will be largely eradicated. It is 
likely, therefore, that their sphere of application will be reduced to analysis of the 
less common laboratory species, since it will be some time yet before the genomes of 
such animals as zebrafish, electric eels, gerbils, crayfish and squid, for example, will 
be sequenced. 

Of course, in the end the question will always remain: What is the functional/ 
biological significance of the identified, differentially expressed genes? One 
persistent problem is understanding whether differentially expressed genes are a 
cause or consequence of the altered state. Furthermore, many chemicals, such as 
non-genotoxic carcinogens, are also mitogens and so genes associated with 
replication will also be upregulated but may have little or nothing to do with the 
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carcinogenic effect. Whilst differential display technology cannot hope to answer 
these questions, it does provide a springboard from which identification, regulatory 
and functional studies can be launched. Understanding the molecular mechanism of 
cellular responses is almost impossible without knowing the regulation and function 
of those genes and their condition (e.g. mutated). In an abstract sense, differential 
display can be likened to a still photograph, showing details of a fixed moment in 
time. Consider the Historian who knows the outcome of a battle and the placement 
and condition of the troops before the battle commenced, but is asked to try and 
deduce how the battle progressed and why it ended as it did from a few still 
photographs— an impossible task. In order to understand the battle, the Historian 
must find out the capabilities and motivation of the soldiers and their commanding 
officers, what the orders were and whether they were obeyed. He must examine the 
terrain, the remains of the battle and consider the effects the prevailing weather 
conditions exerted. Likewise, if mechanistic answers are to be forthcoming, the 
scientist must use differential display in combination with other techniques, such as 
knockout technology, the analysis of cell signalling pathways, mutation analysis and 
time and dose response analyses. Although this review has emphasised the 
importance of differential gene profiling, it should not be considered in isolation and 
the full impact of this approach will be strengthened if used in combination with 
functional genomics and proteomics (2-dimensional protein gels from isoelectric 
focusing and subsequent SDS electrophoresis and virtual 2D-maps using capillary 
electrophoresis). Proteomics is attracting much recent attention as many of the 
changes resulting in differential gene expression do not involve changes in mRNA 
levels, as described extensively herein, but rather protein-protein, protein-DNA and 
protein phosphorylation events which would require functional genomics or 
proteomic technologies for investigation. 

Despite the limitations of differential display technology, it is clear that many 
potential applications and benefits can be obtained from characterizing the genetic 
changes that occur in a cell during normal and disease development and in response 
to chemical or biological insult. In light of functional data, such profiling will 
provide a 'fingerprint' of each stage of development or response, and in the long 
term should help in the elucidation of specific and sensitive biomarkers for different 
types of chemical/biological exposure and disease states. The potential medical and 
therapeutic benefits of understanding such molecular changes are almost 
immeasurable. Amongst other things, such fingerprints could indicate the family or 
even specific type of chemical an individual has been exposed to plus the length 
and/or acuteness of that exposure, thus indicating the most prudent treatment. 
They may also help uncover differences in histologically identical cancers, provide 
diagnostic tests for the earliest stages of neoplasia and, again, perhaps indicate the 
most efficacious treatment. 

The Human Genome Project will be completed early in the next century and the 
DNA sequence of all the human genes will be known. The continuing development 
and evolution of differential gene expression technology will ensure that this 
knowledge contributes fully to the understanding of human disease processes. 
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Abstract 



Recent progress in genomics and proteomics technologies has created a unique opportunity to significantly impact 
the pharmaceutical drug development processes. The perception that cells and whole organisms express specific 
inducible responses to stimuli such as drug treatment implies that unique expression patterns, molecular fingerprints 
indicative of a drug's efficacy and potential toxicity are accessible. The integration into state-of-the-art toxicology of 
assays allowing one to profile treatment-related changes in gene expression patterns promises new insights into 
mechanisms of drug action and toxicity. The benefits will be improved lead selection, and optimized monitoring of 
drug efficacy and safety in pre-clinical and clinical studies based on biologically relevant tissue and surrogate markers. 
© 2000 Elsevier Science Ireland Ltd. All rights reserved. 
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1. Introduction 

The majority of drugs act by binding to protein 
targets, most to known proteins representing en- 
zymes, receptors and channels, resulting in effects 
such as enzyme inhibition and impairment of 
signal transduction. The treatment-induced per- 
turbations provoke feedback reactions aiming to 
compensate for the stimulus, which almost always 
are associated with signals to the nucleus, result- 
ing in altered gene expression. Such gene expres- 
sion regulations account for both the 
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pharmacological action and the toxicity of a drug 
and can be visualized by either global mRNA or 
global protein expression profiling. Hence, for 
each individual drug, a characteristic gene regula- 
tion pattern, its molecular fingerprint, exists 
which bears valuable information on its mode of 
action and its mechanism of toxicity. 

Gene expression is a multistep process that 
results in an active protein (Fig. 1). There exist 
numerous regulation systems that exert control at 
and after the transcription and the translation 
step. Genomics, by definition, encompasses the 
quantitative analysis of transcripts at the mRNA 
level, while the aim of proteomics is to quantify 
gene expression further down-stream, creating a 
snapshot of gene regulation closer to ultimate cell 
function control. 
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2. Global mRNA profiling 

Expression data at the mRNA level can be 
produced using a set of different technologies 
such as DNA microarrays, reverse transcript 
imaging, amplified fragment length polymorphism 
(AFLP), serial analysis of gene expression 
(SAGE) and others. Currently, DNA microarrays 
are very popular and promise a great potential. 
On a typical array, each gene of interest is repre- 
sented either by a long DNA fragment (200-2400 
bp) typically generated by polymerase chain reac- 
tion (PCR) and spotted on a suitable substrate 
using robotics (Schena et al., 1995; Shalon et al., 
1996) or by several short oligonucleotides (20-30 
bp) synthesized directly onto a solid support using 
photolabile nucleotide chemistry (Fodor et al., 
1991; Chee et al., 1996). From control and treated 
tissues, total RNA or mRNA is isolated and 
reverse transcribed in the presence of radioactive 
or fluorescent labeled nucleotides, and the labeled 
probes are then hybridized to the arrays. The 
intensity of the array signal is measured for each 
gene transcript by either autoradiography or laser 
scanning confocal microscopy. The ratio between 
the signals of control and treated samples reflects 
the relative drug-induced change in transcript 
abundance. 
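
As a minimal illustration of the ratio calculation described above, the Python sketch below converts paired array signals into a log2 ratio, so that induction and repression of equal magnitude appear symmetric around zero. The log2 transform and the `floor` parameter (a small lower bound that avoids division by zero for absent signals) are assumptions added here for the example; the text itself specifies only a ratio of control and treated signals. 

```python
import math


def log2_ratio(treated: float, control: float, floor: float = 1.0) -> float:
    """Log2 of the treated/control signal ratio for one array element.

    floor is a hypothetical lower bound applied to both signals so that
    missing or zero intensities do not cause a division by zero.
    """
    if floor <= 0:
        raise ValueError("floor must be positive")
    t = max(treated, floor)
    c = max(control, floor)
    return math.log2(t / c)


if __name__ == "__main__":
    # Hypothetical background-corrected intensities (treated, control).
    spots = {"gene A": (400.0, 100.0), "gene B": (100.0, 400.0)}
    for name, (treated, control) in spots.items():
        print(name, log2_ratio(treated, control))
```

Positive values indicate higher transcript abundance in the treated sample and negative values indicate lower abundance. 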
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3. Global protein profiling 

Global quantitative expression analysis at the 
protein level is currently restricted to the use of 
two-dimensional gel electrophoresis. This tech- 
nique combines separation of tissue proteins by 
isoelectric focusing in the first dimension and by 
sodium dodecyl sulfate slab gel electrophoresis- 
based molecular weight separation on the second, 
orthogonal dimension (Anderson et al., 1991). 
The product is a rectangular pattern of protein 
spots that are typically revealed by Coomassie 
Blue, silver, or fluorescent staining (Fig. 2). 
Protein spots are identified by mass spectrometry 
following generation of peptide mass fingerprints 
(Mann et al., 1993) and sequence tags (Wilkins et 
al., 1996). Similar to the mRNA approach, the 
ratio between the optical density of spots from 
control and treated samples is used to 
search for treatment-related changes. 

4. Expression data analysis 

Bioinformatics forms a key element required to 
organize, analyze and store expression data from 
either source, the mRNA or the protein level. The 
overall objective, once a mass of high-quality 
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Fig. 1. Production of an active protein is a multistep process in which numerous regulation systems exert control at various stages 
of expression. Molecular fingerprints of drugs can be visualized through expression profiling at the mRNA level (genomics) using 
a variety of technologies and at the protein level (proteomics) using two-dimensional gel electrophoresis. 
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Fig. 2. Computerized representation of a Coomassie Blue stained two-dimensional gel electrophoresis pattern of Fischer F344 rat 
liver homogenate. 



quantitative expression data has been collected, is 
to visualize complex patterns of gene expression 
changes, to detect pathways and sets of genes 
tightly correlated with treatment efficacy and toxi- 
city, and to compare the effects of different sets of 
treatment (Anderson et al., 1996). As the drug 
effect database is growing, one may detect similar- 
ities and differences between the molecular finger- 
prints produced by various drugs, information 
that may be crucial to make a decision whether to 
refocus or extend the therapeutic spectrum of a 
drug candidate. 



5. Comparison of global mRNA and protein 
expression profiling 

There are several synergies and overlaps of data 
obtained by mRNA and protein expression analy- 
sis. Low abundant transcripts may not be easily 
quantified at the protein level using standard two- 
dimensional gel electrophoresis analysis and their 
detection may require prefractionation of sam- 
ples. The expression of such genes may be prefer- 
ably quantified at the mRNA level using 
techniques allowing PCR-mediated target amplifi- 
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cation. Tissue biopsy samples typically yield good 
quality of both mRNA and proteins; however, the 
quality of mRNA isolated from body fluids is 
often poor due to the faster degradation of 
mRNA when compared with proteins. RNA sam- 
ples from body fluids such as serum or urine are 
often not very 'meaningful', and secreted proteins 
are likely more reliable surrogate markers for 
treatment efficacy and safety. Detection of post- 
translational modifications, events often related to 
function or nonfunction of a protein, is restricted 
to protein expression analysis and rarely can be 
predicted by mRNA profiling. Information on 
subcellular localization and translocation of 
proteins has to be acquired at the level of the 
protein in combination with sample prefractiona- 
tion procedures. The growing evidence of a poor 
correlation between mRNA and protein abun- 
dance (Anderson and Seilhamer, 1997) further 
suggests that the two approaches, mRNA and 
protein profiling, are complementary and should 
be applied in parallel. 

6. Expression profiling and drug development 

Understanding the mechanisms of action and 
toxicity, and being able to monitor treatment 
efficacy and safety during trials is crucial for the 
successful development of a drug. Mechanistic 
insights are essential for the interpretation of drug 
effects and enhance the chances of recognizing 
potential species specificities contributing to an 
improved risk profile in humans (Richardson et 
al., 1993; Steiner et al., 1996b; Aicher et al., 1998). 
The value of expression profiling further increases 
when links between treatment-induced expression 
profiles and specific pharmacological and toxic 
endpoints are established (Anderson et al., 1991, 
1995, 1996; Steiner et al., 1996a). Changes in gene 
expression are known to precede the manifesta- 
tion of morphological alterations, giving expres- 
sion profiling a great potential for early 
compound screening, enabling one to select drug 
candidates with wide therapeutic windows 
reflected by molecular fingerprints indicative of 
high pharmacological potency and low toxicity 
(Arce et al., 1998). In later phases of drug development, 
surrogate markers of treatment efficacy 
and toxicity can be applied to optimize the moni- 
toring of pre-clinical and clinical studies (Doherty 
et al., 1998). 



7. Perspectives 

The basic methodology of safety evaluation has 
changed little during the past decades. Toxicity in 
laboratory animals has been evaluated primarily 
by using hematological, clinical chemistry and 
histological parameters as indicators of organ 
damage. The rapid progress in genomics and proteomics 
technologies creates a unique opportunity 
to dramatically improve the predictive power of 
safety assessment and to accelerate the drug devel- 
opment process. Application of gene and protein 
expression profiling promises to improve lead se- 
lection, resulting in the development of drug can- 
didates with higher efficacy and lower toxicity. 
The identification of biologically relevant surro- 
gate markers correlated with treatment efficacy 
and safety bears a great potential to optimize the 
monitoring of pre-clinical and clinical trials. 
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ABSTRACT Microarrays containing 1046 human cDNAs 
of unknown sequence were printed on glass with high-speed 
robotics. These 1.0-cm² DNA "chips" were used to quantitatively 
monitor differential expression of the cognate human 
genes using a highly sensitive two-color hybridization assay. 
Array elements that displayed differential expression patterns 
under given experimental conditions were characterized by 
sequencing. The identification of known and novel heat shock 
and phorbol ester-regulated genes in human T cells demonstrates 
the sensitivity of the assay. Parallel gene analysis with 
microarrays provides a rapid and efficient method for large- 
scale human gene discovery. 

Biology has entered the genome era (1). Complete genome 
sequences for all of the model organisms and human will 
probably be available by the year 2003 (2). Torrents of human 
expressed sequence tags (ESTs) provide a starting point for 
elucidating the function of tens of thousands of cognate genes 
(3). Genome analysis will provide insights into growth, devel- 
opment, differentiation, homeostasis, aging, and the onset of 
diseases (1-3). A detailed understanding of the human genome 
will require the implementation of sophisticated methods for 
gene expression analysis and gene discovery. 

Recently, a microarray-based method for high-throughput 
monitoring of plant gene expression was described (4). This 
"chip"-based approach involved using microarrays of cDNA 
clones as gene-specific hybridization targets to quantitatively 
measure expression of the corresponding plant genes (4, 5). A 
two-color fluorescence labeling and detection scheme facilitated 
sensitive differential expression analysis of different 
plant tissues (4, 5). The efficiency of this approach for studies 
in higher plants suggested the use of this method for human 
genome analysis (4-7). Here, we report the use of cDNA 
microarrays for human gene expression monitoring, biological 
investigation, and gene discovery. 

MATERIALS AND METHODS 
Human cDNA Clones. The cDNA library was made with 
mRNA from human peripheral blood lymphocytes transformed 
with the Epstein-Barr virus. Inserts >600 bp were 
cloned into the lambda vector λYES-R to generate 10 ? -10* 
recombinants. Bacterial transformants were obtained by 
infecting E. coli strain JM107/λKC. Colonies were picked at 
random and propagated in a 96-well format, and minilysate 
DNA was prepared by alkaline lysis using REAL preps 
(Qiagen, Chatsworth, CA). Inserts were amplified by PCR in 
a 96-well format using primers (PAN132, 5'-CCTCTATACTTTAACGTCAAGG; 
and PAN133, 5'-TTGTGTGGAATTGTGAGCGG) complementary to the λYES 
polylinker and containing a six-carbon amino modification 
(Glen Research, Sterling, VA) on the 5' end. PCR products 
were purified in a 96-well format using QIAquick columns 
(Qiagen). 

The publication costs of this article were defrayed in part by page charge 
payment. This article must therefore be hereby marked "advertisement" in 
accordance with 18 U.S.C. §1734 solely to indicate this fact. 

Microarray Preparation. Amino-modified PCR products 
were suspended at a concentration of 0.5 mg/ml in 3× 
standard saline citrate (SSC) and arrayed from 96-well 
microtiter plates onto silylated microscope slides (CEL Associates, 
Houston) using high-speed robotics (4-7). A total of 1056 
cDNAs, representing 1046 human clones and 10 Arabidopsis 
controls, were arrayed in 1.0-cm² areas. Printed arrays were 
incubated for 4 hr in a humid chamber to allow rehydration of 
the array elements and rinsed once in 0.2% SDS for 1 min, 
twice in H2O for 1 min, and once for 5 min in sodium 
borohydride solution (1.0 g of NaBH4 dissolved in 300 ml of 
PBS and 100 ml of 100% ethanol). The arrays were submerged 
in H2O for 2 min at 95°C, transferred quickly into 0.2% SDS 
at 25°C, rinsed twice in H2O, dried, and stored in the dark. 

Preparation of Probes. Tissue mRNAs were purchased 
(CLONTECH). Jurkat mRNA was isolated as described by 
Schena et al. (4). Probes were made as described (4) with 
several modifications. The reverse transcriptase used here was 
Superscript II RNase H- (GIBCO). The Cy5-dCTP was 
purchased from Amersham. Each reverse transcription 
reaction contained 3.0 µg of total human mRNA. Arabidopsis 
control mRNAs were made by in vitro transcription of cloned 
HAT4, HAT22, and YesAt-23 cDNAs (4, 8, 9) using an RNA 
Transcription Kit (Stratagene). For quantitation, the mRNAs 
were doped into the reverse transcription reaction at ratios of 
1:100,000, 1:10,000, and 1:1000 (wt/wt), respectively. Following 
the reverse transcription step, samples were treated with 2.5 µl 
of 1 M sodium hydroxide for 10 min at 37°C, then neutralized 
by adding 2.5 µl of 1 M Tris-HCl (pH 6.8) and 2.0 µl of 1 M 
HCl. Probe mixtures contained cDNA products derived from 
3 µg of total mRNA, suspended in 5.0 µl of hybridization 
buffer (5× SSC plus 0.2% SDS). 
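
The doping ratios above lend themselves to a simple calibration: the three Arabidopsis control elements have known abundance, so their fluorescence can anchor a conversion from signal to abundance. The following is a minimal sketch only, assuming a straight-line relation between log signal and log abundance; the function names and all intensity values are hypothetical and are not taken from the paper.

```python
import math

def fit_loglog(abundances, signals):
    """Least-squares line through (log10 abundance, log10 signal)."""
    xs = [math.log10(a) for a in abundances]
    ys = [math.log10(s) for s in signals]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def signal_to_abundance(signal, slope, intercept):
    """Invert the fitted line to estimate abundance (wt/wt) from a signal."""
    return 10 ** ((math.log10(signal) - intercept) / slope)

# Doping ratios stated in the text (wt/wt).
control_abundance = [1 / 100_000, 1 / 10_000, 1 / 1_000]
# Hypothetical control-element intensities (arbitrary units).
control_signal = [50.0, 500.0, 5000.0]

slope, intercept = fit_loglog(control_abundance, control_signal)

# Hypothetical human array element with an intensity of 250 units.
copies_per_100k = signal_to_abundance(250.0, slope, intercept) * 100_000
print(round(copies_per_100k, 3))
```

With real scans the controls would not fall exactly on a line, and the fitted slope would indicate how far the detector departs from linearity over the calibrated range.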

Hybridization and Scanning. Probes were hybridized to 
1.0-cm² microarrays under a 14 × 14 mm glass coverslip for 
6-12 hr at 60°C in a custom-built hybridization chamber (4-7). 
Arrays were washed for 5 min at room temperature (25°C) in 
low stringency wash buffer (1× SSC/0.2% SDS), then for 10 
min at room temperature in high stringency wash buffer (0.1× 
SSC/0.2% SDS). Arrays were scanned in 0.1× SSC using a 
fluorescence laser scanning device (4-7) fitted with a custom 
filter set (Chroma Technology, Brattleboro, VT). Accurate 
differential expression measurements (i.e., final fluorescence 
ratios) were obtained by taking the average of the ratios of two 
independent hybridizations. 
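
The averaging step can be illustrated with a short sketch. It assumes that the two independent hybridizations are the dye-swapped pair described in Results, so the treated-over-control ratio is Cy5/fluorescein in one hybridization and fluorescein/Cy5 in the other. The function names and intensity values are hypothetical.

```python
def treated_over_control(fluorescein, cy5, treated_dye):
    """Return the treated/control ratio for one array element.

    treated_dye names the fluorophore that labeled the treated sample.
    """
    if treated_dye == "cy5":
        return cy5 / fluorescein
    if treated_dye == "fluorescein":
        return fluorescein / cy5
    raise ValueError(f"unknown dye: {treated_dye!r}")

def final_ratio(first, second):
    """Average the ratios from two independent hybridizations."""
    ratios = [treated_over_control(**first), treated_over_control(**second)]
    return sum(ratios) / len(ratios)

# Hypothetical intensities for one element (arbitrary units).
first = {"fluorescein": 100.0, "cy5": 240.0, "treated_dye": "cy5"}
second = {"fluorescein": 260.0, "cy5": 100.0, "treated_dye": "fluorescein"}

ratio = final_ratio(first, second)
print(round(ratio, 2))
```

Swapping the dyes means that a bias tied to one fluorophore pushes the two ratios in opposite directions, so averaging reduces it.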

Abbreviation: EST, expressed sequence tag. 

Data deposition: The sequences reported in this paper have been 
deposited in the GenBank data base (accession nos. U56654-U56660). 
To whom reprint requests should be addressed. e-mail: 
schena@cmgm.stanford.edu. 

Cell Culture. Jurkat cells were grown in a tissue culture 
incubator (37°C and 5% CO2) in RPMI medium supplemented 
with 10% fetal bovine serum, 100 ng of streptomycin per ml, 
and 500 units of penicillin per ml. Heat shock corresponded to 
a 4-hr incubation at 43°C. Phorbol ester-treated cells were 
grown for 4 hr in the presence of 50 ng of phorbol 12-myristate 
13-acetate (PMA) per ml. 

RNA Blotting. Dot blots were performed as described (4). 

DNA Sequencing. Sequences were obtained using the 
PAN132 and PAN133 primers and a 373A automated se- 
quencer, according to the instructions of the manufacturer 
(Applied Biosystems). 

Computer Graphics and Informatics. Pseudocolor 
representations of fluorescent images were made with National Institutes 
of Health Image software (version 1.52). Software for differential 
expression representations was purchased from Imaging 
Research (St. Catharines, ON, Canada). Sequence searches were 
made to the nonredundant nucleotide data base at the National 
Center for Biotechnology Information (NCBI) using Macintosh 
BLAST software. The EST data base was accessed via the World 
Wide Web (http://www.ncbi.nlm.nih.gov/). 

RESULTS 

Gene Discovery and the Heat Shock Response. Microarrays 
were used to examine the heat shock response in cultured 
human T (Jurkat) cells. Control (37°C) and heat-treated 
(43°C) cells were harvested and lysed, and total mRNA from 
the two cell samples was labeled by reverse transcriptase 
incorporation of fluorescein- and Cy5-dCTP, respectively. In 
a second set of labeling reactions, the fluorescent groups were 
"swapped" such that control and heat-treated 
samples were labeled with Cy5- and fluorescein-dCTP, 
respectively. Each pair of fluorescent probes was hybridized to a 
1056-element microarray. The arrays were washed at high 
stringency and scanned with a confocal laser scanning device 
to detect emission of the two fluorescent groups. 

Hybridization signals were observed to >95% of the human 
cDNA array elements, but not to any of the Arabidopsis 
negative controls (Fig. 1). Fluorescence intensities spanned 
more than three orders of magnitude for the 1046 array 
elements surveyed (Fig. 1). Comparative expression analysis of 
heat-shocked versus control cells in the two experiments 
revealed 17 array elements that displayed altered fluorescence 
ratios of ≥2.0-fold (Figs. 1 and 2A). Of the 17 putative 
differentially expressed genes, 11 were induced by heat shock 
treatment and 6 displayed modest repression (Figs. 1 and 2A). 
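
The selection of differentially expressed elements amounts to a threshold on the final fluorescence ratio. The sketch below assumes the 2.0-fold cutoff is applied symmetrically, so that a ratio of 0.5 or less counts as repression; that reading is an assumption, as are the function name and the one made-up element. The three B-clone ratios are those listed in Table 1.

```python
def classify(ratio, fold=2.0):
    """Label an element by its treated/control fluorescence ratio."""
    if ratio >= fold:
        return "induced"
    if ratio <= 1.0 / fold:
        return "repressed"
    return "unchanged"

ratios = {
    "B1": 0.5,              # CYC oxidase III (Table 1)
    "B11": 2.4,             # TCP-1 (Table 1)
    "B16": 5.8,             # HSP90 alpha (Table 1)
    "hypothetical_x": 1.3,  # made-up element below the cutoff
}

calls = {clone: classify(r) for clone, r in ratios.items()}
for clone, call in calls.items():
    print(clone, call)
```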

To determine the identity of the heat-regulated genes, 
cDNAs corresponding to each of the 17 array elements were 
sequenced on the proximal and distal end. Data base searches 
revealed perfect matches for 14 of the 17 clones, and in each 
case proximal and distal cDNA sequences mapped to the same 
gene (Table 1). Of the 1046 human genes examined on the 
microarray, the five most highly induced in heat-treated cells 
were heat shock protein 90α (hsp90α), dnaJ, hsp90β, 
polyubiquitin, and t-complex polypeptide-1 (tcp-1) (Table 1). Three 
of the 17 clones did not match any entry in the public data base, 
though one of the clones (B7) exhibited significant homology 
to an EST from Caenorhabditis elegans (Table 1). Each of the 
novel sequences (B7-B9) exhibited ~2-fold induction (Table 1) 
and relatively low-level expression (Table 2). 

To confirm the microarray results, mRNA levels for each of 
the genes were measured by RNA blotting. Each of the genes 
that displayed heat shock induction, including the three novel 



Fig. 1. Gene expression monitored on a microarray. Fluorescent scans 
represented in a pseudocolor scale correspond to expression levels 
(color bar: Expression Level per 100,000, scale 1 to 100; panel 
label: +Heat Shock). Microarray rows (at left) and columns (at the 
top) are demarcated at 10-element increments (white circles). 
(Bar = 1 mm.) 



10616 Biochemistry: Schena et al. Proc. Natl. Acad. Sci. USA 93 (1996) 

Fig. 2. Elemental display of activated and repressed genes (panels: 
-/+ Heat Shock and -/+ Phorbol Ester; color bar: Expression Ratios). 
Fluorescence ratios of two-color microarray scans (Fig. 1) are depicted 
schematically. Fluorescein-labeled probes from Jurkat cells subjected to 
(A) heat shock or (B) phorbol ester treatment were compared with 
Cy5-labeled probes from untreated cells. In a second set of reactions, 
the fluorescent groups were swapped (see text). The data represent the 
average of the ratios from two hybridizations, excluding values in which 
the difference of the two ratios [illegible]. The color bar corresponds 
to expression ratios, which are independent of the absolute expression 
level of a given gene. 



Table 1. Microarray elements corresponding to differentially expressed genes 

Clone  Ratio  Blast identity     Accession no. 
B1     0.5    CYC oxidase III    J01415, J01415 
B2     0.5    β-Actin            NR, X00351 
B3     0.5    CYC oxidase III    J01415, J01415 
B4     0.5    CYC oxidase III    J01415, J01415 
B5     0.5    CYC oxidase III    J01415, J01415 
B6     0.5    β-Actin            NR, X00351 
B7     2.0    Novel†             U56653, U56654 
B8     2.0    Novel‡             U56655, U56656 
B9     2.2    Novel‡             U56657, U56658 
B10    2.4    Polyubiquitin      X04803, X04803 
B11    2.4    TCP-1              X52882, X52882 
B12    2.5    Polyubiquitin      M17597, M17597 
B13    2.5    Polyubiquitin      X04803, X04803 
B14    2.6    HSP90β             M16660, M16660 
B15    4.0    DnaJ homolog       D13388, D13388 
B16    5.8    HSP90α             X07270, X07270 
B17    6.3    HSP90α             M27024, X15183 
B18    2.0    β2-microglobulin   S54761, M30683 
B19    2.1    Novel‡             U56659, U56660 
B20    2.2    β2-microglobulin   S54761, M30683 
B21    2.6    PGK                M11968, L00160 
B22    3.5    NF-κB1             Z47744, M55643 
B23    19     PAC-1              L11329, L11329 

Row and Column entries survive only in part and cannot be aligned to 
individual clones. Row: 24, 1, 15, 32, 17, 14, 7, 12, 28, 14, 20, 30, 
10, 13, 7, 21, 3, 1. Column: 21, 31, 8, 19, 8, 31, 4, 19, 5, 8, 7, 9, 
12, 16, 19, 30, 26, 18, 30, 16. 

Clone name, array position (Fig. 1), fluorescence ratio, sequence identity, and accession number of cDNAs that manifested 
a differential expression pattern with probes prepared from heat shock- (B1-17) or phorbol ester-treated (B18-23) Jurkat cells. 
Clones showing >98% identity over 300 nucleotides were assumed to be identical to known sequences. All genes are nuclear 
except CYC oxidase III (mitochondrial). Accession numbers reflect the highest score for proximal and distal sequence traces, 
respectively. CYC, cytochrome c; TCP-1, T-complex polypeptide; HSP, heat shock protein; PGK, phosphoglycerate kinase; 
NF-κB, nuclear factor-kappaB; PAC-1, phosphatase of activated cells; and NR, trace not readable due to the presence of 
poly(A)+ tract. 
†B7 is 67% identical to an EST from C. elegans (D76026). 
‡No match in the public data bases. 

Table 2. Expression level per 10^5 mRNAs (-/+ treatment) and expression ratio, by microarray and by RNA blot 

Clone  Blast identity            Microarray   Ratio  RNA blot   Ratio 
B1     CYC oxidase III           92/46        0.5    100/80     0.8 
B2     β-Actin                   240/120      0.5    270/280    1.0 
B3     CYC oxidase III           36/18        0.5    ND         ND 
B4     CYC oxidase III           76/38        0.5    ND         ND 
B5     CYC oxidase III           [illegible]  0.5    ND         ND 
B6     β-Actin                   100/50       0.5    ND         ND 
B7     Novel (weakly to D76026)  [illegible]  2.0    0.77/1.8   2.3 
B8     Novel                     [illegible]  2.0    1.5/3.4    2.3 
B9     Novel                     [illegible]  2.2    1.2/1.8    1.5 
B10    Polyubiquitin             [illegible]  2.4    25/89      3.6 
B11    TCP-1                     2.3/5.5      2.4    7.1/27     3.8 
B12    Polyubiquitin             0.8/2.0      2.5    ND         ND 
B13    Polyubiquitin             1.7/4.3      2.5    ND         ND 
B14    HSP90β                    75/200       2.6    30/120     4.0 
B15    DnaJ homolog              1.0/4.0      4.0    1.6/13     8.1 
B16    HSP90α                    0.6/3.5      5.8    3.2/29     9.1 
B17    HSP90α                    0.8/5.0      6.3    8.6/62     7.2 
B18    β2-microglobulin          1.0/2.0      2.0    5.4/15     2.8 
B19    Novel                     1.2/2.5      2.1    4.5/9.5    [illegible] 
B20    β2-microglobulin          2.7/5.9      2.2    ND         ND 
B21    Phosphoglycerate kinase   2.4/6.2      2.6    4.7/9.2    2.0 
B22    NF-κB1                    1.7/6.0      3.5    0.65/4.7   7.2 
B23    PAC-1                     0.5/9.5      19     0.21/15    71 



sequences, exhibited elevated mRNA levels by dot blot analysis 
(Table 2). In all cases, expression ratios as determined by the 
two procedures differed by <2-fold for the genes identified in 
the heat shock experiments (Table 2). The two assays differed 
more widely in terms of assessing absolute expression levels; 
nonetheless, absolute expression as monitored on a microarray 
typically correlated with RNA blots to within a factor of five 
(Table 2). 
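
The agreement between the two assays can be expressed as a fold discrepancy, the larger ratio divided by the smaller. The sketch below applies that measure to three heat shock rows whose Table 2 ratios are legible in this copy (B14, B16, B17); the function name is hypothetical, and the choice of rows is illustrative rather than a re-analysis of the full table.

```python
def fold_discrepancy(a, b):
    """Larger of two positive ratios divided by the smaller."""
    if a <= 0 or b <= 0:
        raise ValueError("ratios must be positive")
    return max(a, b) / min(a, b)

# (microarray ratio, RNA blot ratio) as read from Table 2.
rows = {
    "B14": (2.6, 4.0),
    "B16": (5.8, 9.1),
    "B17": (6.3, 7.2),
}

discrepancies = {clone: fold_discrepancy(m, b) for clone, (m, b) in rows.items()}
for clone, d in discrepancies.items():
    print(clone, round(d, 2))
```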

Phorbol Ester Signaling. To explore a signaling pathway 
distinct from the heat shock response, microarrays were used 
to examine the cellular effects of phorbol ester treatment. 
Jurkat cells were treated with phorbol ester, harvested, lysed, 
and used as a source of mRNA. Samples of mRNA from 
untreated or phorbol ester-stimulated cells were labeled with 
reverse transcriptase. The probes were mixed, hybridized to 
microarrays, and scanned for fluorescence emission of the two 
fluorescent groups. A total of six array elements displayed 
≥2.0-fold elevated signals with probes from phorbol 
ester-treated cells relative to control samples (Fig. 2B). 

To determine the identity of the phorbol ester-induced 
genes, clones corresponding to the six array elements were 
sequenced. Data base searches revealed perfect matches for 
five of the six sequences (Table 1). The two most highly 
induced genes were the PAC-1 tyrosine phosphatase and 
nuclear factor-kappa B1 (NF-κB1); modest activation was 
observed for phosphoglycerate kinase and β2-microglobulin 
(Table 1). One remaining clone (B19) did not match any entry 
in the public data base (Table 1). B19 displayed a 2.1-fold 
induction and, similar to the novel heat shock genes, a 
relatively low absolute expression level (Tables 1 and 2). All six of 
the phorbol ester-inducible genes displayed increased 
steady-state mRNA levels by RNA blotting (Table 2). PAC-1 
expression (Fig. 1; Table 2) defined a detection limit of ~1:500,000 
for the assay. 

Transcript Imaging in Human Tissues. To determine 
whether microarrays could be used to monitor expression in 
human tissues, probes were prepared from human bone 
marrow, brain, prostate, and heart by labeling each mRNA sample 
with Cy5-dCTP. In a separate reaction, a control probe was 
prepared by labeling Jurkat mRNA with fluorescein-dCTP. 
The four Cy5-labeled probes were each mixed with an aliquot 
of the fluorescein-labeled control sample, and the four 
mixtures were hybridized to separate microarrays. The arrays were 
washed and scanned for fluorescence emission, and 
hybridization signals for each of the tissue samples were normalized 
to the Jurkat control to generate an expression profile for each 
of the 1046 clones present on the array. 

Detectable expression was observed for all 15 of the heat 
shock and phorbol ester-regulated genes in the four tissue 
types examined (Fig. 3). In general, the expression level of each 
gene in Jurkat cells correlated rather closely with expression in 
the four tissues (Table 2; Fig. 3). Genes encoding β-actin and 
cytochrome c oxidase, the two most highly expressed of the 15 
genes in Jurkat cells (Table 2), were highly expressed in bone 
marrow, brain, prostate, and heart (Fig. 3A). Expression of 
cytochrome c oxidase, hsp90α, and the novel B7 sequence was 
significantly greater in heart than in the other tissues (Fig. 3). 

DISCUSSION 

Many of the heat shock genes identified in this study encode 
factors that function either as molecular "chaperones" 
(HSP90α, HSP90β, DnaJ, TCP-1) or as mediators of protein 
degradation (polyubiquitin). The identification of these 
sequences is consistent with the biochemical basis of heat shock 
induction (10-15). Proteins undergo denaturation at elevated 
temperatures, and those that fail to maintain proper 
conformation must be selectively degraded (10-15). It will be 
interesting to determine whether the three novel heat 
shock-inducible sequences (B7-B9) mediate protein folding and 
turnover or possess some other biochemical activity. Complete 
nucleotide sequence determination, conceptual translation, 
expression monitoring, and biochemical analysis should 
provide a detailed functional understanding of these genes. 

Fig. 3. Transcript profiles of heat shock and phorbol 
ester-regulated genes. Gene expression levels per 100,000 mRNAs (y-axes) 
are shown for 15 genes (Table 1) in human bone marrow (red), brain 
(green), prostate (blue), and heart (yellow). Genes are grouped 
according to expression levels (A-C). 



Phorbol ester, a potent activator of protein kinase C (16, 17), 
induced a set of genes distinct from those involved in the heat 
shock pathway. The most highly induced gene identified in this 
study, PAC-1, encodes a nuclear tyrosine phosphatase that may play 
a role in regulating transcription and cell cycle progression 
(18). NF-κB1, a second phorbol ester-inducible gene, is an 
intensively studied member of the Rel transcription factor 
family (19-21). The Rel proteins are activated by a large 
number of stimuli, including phorbol esters, cytokines, 
bacterial and viral pathogens, and ultraviolet light (19-21). Modest 
activation was observed for three sequences not known to be 
inducible by phorbol esters, including phosphoglycerate 
kinase, β2-microglobulin, and a novel human gene (B19). 
Extensive expression monitoring with microarrays should assist in 
understanding how each of these genes integrates into the 
highly complex phorbol ester signaling pathway. 

It is striking that four novel human genes were discovered 
with an array of 1000 randomly chosen clones, particularly 
because the heat shock and phorbol ester signaling pathways 
have been so intensively studied (10-21). The facile discovery 
of these sequences underscores the fact that microarrays can 
be used for gene discovery in the absence of any sequence 
information. By this approach, clones are chosen at random 
from any library of interest and only those clones that display 
interesting expression patterns are sequenced and character- 
ized. This parallel assay, coupled with a modest DNA sequenc- 
ing facility, allows high-throughput human genome expression 
analysis and gene discovery. 

Genes that are activated or repressed by a given stimulus 
provide functional clues to the cellular pathway involved 
(22-24). Detailed examination of these gene expression "sig- 
natures" can provide a dynamic view of the mode of action of 
a given signaling substance (22-24). Microarrays may thus 
allow rapid mechanistic examination of hormones, drugs, 
elicitors, and other small molecules; moreover, functional 
analysis of transcription factors, kinases, growth factors, cyto- 
kines, receptors, and other gene products should be possible. 
Efforts are underway to develop mRNA amplification strate- 
gies to enable probe preparation from minute tissue samples. 
This capability might allow for high-throughput patient screen- 
ing in a clinical setting. 

The current detection limit of the assay allows monitoring of 
transcripts that represent ~1:500,000 (wt/wt) of the total 
mRNA. This 10-fold increase in sensitivity compared with the 
original report (4) was achieved largely by modifying the 
coupling chemistry, which reduced background fluorescence. 
The significance of this improvement is considerable in that 
approximately half the human genes identified in this study, 
including all four novel sequences, exhibited expression levels 
below the original detection limit of 1:50,000 (4). 
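
The two detection limits are easier to compare once converted to the "per 100,000 mRNAs" unit used in the figures and Table 2. The sketch below does that conversion and checks a few expression levels against each limit; the function names and the three example levels are illustrative assumptions, not values taken from the paper.

```python
def limit_per_100k(denominator):
    """Convert a 1:denominator detection limit to copies per 100,000 mRNAs."""
    return 100_000 / denominator

def detectable(level_per_100k, denominator):
    """True if a transcript at this level meets the 1:denominator limit."""
    return level_per_100k >= limit_per_100k(denominator)

original_limit = limit_per_100k(50_000)   # limit of the original report (4)
current_limit = limit_per_100k(500_000)   # limit reported here

# Illustrative expression levels, per 100,000 mRNAs.
levels = [0.5, 1.7, 5.0]
rescued = [lv for lv in levels
           if detectable(lv, 500_000) and not detectable(lv, 50_000)]
print(original_limit, current_limit, rescued)
```

Transcripts between 0.2 and 2 copies per 100,000 are the ones that only the improved assay would register under these assumptions.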

The ability to detect 2-fold changes in expression was 
achieved by the use of two-color fluorescence in the labeling 
and detection schemes, digitized data collection, and custom 
software. The importance of this capability is underscored by 
the fact that nearly all of the genes examined here exhibited 
<6-fold changes in expression. The four novel genes, which 
showed ≤2.2-fold activation, were probably overlooked in 
previous screens that used conventional differential expression 
techniques. It may be possible to further improve the precision 
of the microarray assay by the use of closely related fluorescent 
analogs, such as Cy3 and Cy5, in the labeling and hybridization 
reactions. 

Microarrays offer a number of advantages over other po- 
tential high-capacity approaches to expression analysis. The 
chip-based approach enables small hybridization volumes, high 
array densities, and the use of fluorescence labeling and 
detection schemes. These features provide a set of perfor- 
mance specifications that are unattainable with filter-based 
approaches (25, 26). The use of cDNA clones provides 
hybridization specificity that is not readily attained with 
oligonucleotide arrays (27-30). The parallel format of the assay 
provides a simultaneous differential expression readout for 
>1000 genes. This contrasts with sequencing-based methods, 
which require serial data collection for expression analysis (31, 
32). A commercial source of cDNA microarrays would greatly 
speed the use of a chip-based approach to expression analysis. 
The availability of large numbers of ESTs (3) provides a rich 
resource of human cDNA clones for microarraying. The 
>400,000 ESTs in the public data bases represent a significant 
subset of all human genes (3, 33). Microarrays of thousands of 
ESTs will provide a powerful analytical tool for future human 
gene expression studies. The ~100,000 genes in the human 
genome (2, 33) emphasize the need for microarrays of greater 
density. Attempts to improve microdeposition techniques are 
underway and should allow construction of arrays containing 
a complete set of human gene targets 
(http://cmgm.stanford.edu/~schena/). Microarrays of ~100,000 cDNA elements 
would allow expression monitoring of the entire human 
genome in a single hybridization. This capacity, coupled with 
detailed biochemical analysis of the individual gene products, 
would greatly speed the functional analysis of the human 
genome. 

We thank S. Elledge (selledge@bcm.tmc.edu) for the human cDNA 
library, Qiagen representatives for help with plasmid purification, and 
A. Smith and colleagues at the Protein and Nucleic Acid (PAN) 
facility (Stanford) for oligonucleotide synthesis and DNA sequencing. 
We also thank members of the Davis, Brown, and Smith laboratories 
for critical comments and helpful discussions and Synteni employees 

Quantitative Monitoring of Gene Expression 
Patterns with a Complementary DNA Microarray 

Mark Schena,* Dari Shalon,*† Ronald W. Davis, 
Patrick O. Brown* 

A high-capacity system was developed to monitor the expression of many genes in 
parallel. Microarrays prepared by high-speed robotic printing of complementary DNAs on 
glass were used for quantitative expression measurements of the corresponding genes. 
Because of the small format and high density of the arrays, hybridization volumes of 2 
microliters could be used that enabled detection of rare transcripts in probe mixtures 
derived from 2 micrograms of total cellular messenger RNA. Differential expression 
measurements of 45 Arabidopsis genes were made by means of simultaneous, two-color 
fluorescence hybridization. 

The temporal, developmental, topographical, 
histological, and physiological patterns 
in which a gene is expressed provide clues to 
its biological role. The large and expanding 
database of complementary DNA (cDNA) 
sequences from many organisms (1) presents 
the opportunity of defining these patterns at 
the level of the whole genome. 

For these studies, we used the small 
flowering plant Arabidopsis thaliana as a model 
organism. Arabidopsis possesses many 
advantages for gene expression analysis, 
including the fact that it has the smallest 
genome of any higher eukaryote examined 
to date (2). Forty-five cloned Arabidopsis 
cDNAs (Table 1), including 14 complete 
sequences and 31 expressed sequence tags 
(ESTs), were used as gene-specific targets. 
We obtained the ESTs by selecting cDNA 
clones at random from an Arabidopsis 
cDNA library. Sequence analysis revealed 
that 28 of the 31 ESTs matched sequences 
in the database (Table 1). Three additional 
cDNAs from other organisms served as 
controls in the experiments. 

The 48 cDNAs, averaging ~1.0 kb, 
were amplified with the polymerase chain 
reaction (PCR) and deposited into indi- 
vidual wells of a 96-well microtiter plate. 
Each sample was duplicated in two adja- 
cent wells to allow the reproducibility of 
the arraying and hybridization process to 
be tested. Samples from the microtiter 
plate were printed onto glass microscope 
slides in an area measuring 3.5 mm by 5.5 
mm with the use of a high-speed arraying 
machine (3). The arrays were processed by 
chemical and heat treatment to attach the 
DNA sequences to the glass surface and 
denature them (3). Three arrays, printed 
in a single lot, were used for the experi- 
ments here. A single microtiter plate of 
PCR products provides sufficient material 
to print at least 500 arrays. 

Fluorescent probes were prepared from 
total Arabidopsis mRNA (4) by a single 
round of reverse transcription (5). The 
Arabidopsis mRNA was supplemented with 
human acetylcholine receptor (AChR) mRNA 
at a dilution of 1:10,000 (w/w) before cDNA 
synthesis, to provide an internal standard for 
calibration (5). The resulting fluorescently 
labeled cDNA mixture was hybridized to an 
array at high stringency (6) and scanned 
with a laser (3). A high-sensitivity scan gave 
signals that saturated the detector at nearly 
all of the Arabidopsis target sites (Fig. 1A). 
Calibration relative to the AChR mRNA 
standard (Fig. 1A) established a sensitivity 
limit of ~1:50,000. No detectable 
hybridization was observed to either the rat 
glucocorticoid receptor (Fig. 1A) or the yeast TRP4 
(Fig. 1A) targets even at the highest 
scanning sensitivity. A moderate-sensitivity scan 
of the same array allowed linear detection of 
the more abundant transcripts (Fig. 1B). 
Quantitation of both scans revealed a range 
of expression levels spanning three orders of 
magnitude for the 45 genes tested (Table 2). 
RNA blots (7) for several genes (Fig. 2) 
corroborated the expression levels measured 
with the microarray to within a factor of 5 
(Table 2). 

Differential gene expression was 
investigated with a simultaneous, two-color 
hybridization scheme, which served to 
minimize experimental variation inherent in the 
comparison of independent hybridizations. 
Fluorescent probes were prepared from two 
mRNA sources with the use of reverse 
transcriptase in the presence of fluorescein- and 
lissamine-labeled nucleotide analogs, 
respectively (5). The two probes were then 
mixed together in equal proportions, 
hybridized to a single array, and scanned 
separately for fluorescein and lissamine 
emission after independent excitation of the two 
fluorophores (3). 

To test whether overexpression of a 
single gene could be detected in a pool of total 
Arabidopsis mRNA, we used a microarray to 
analyze a transgenic line overexpressing the 
single transcription factor HAT4 (8). 
Fluorescent probes representing mRNA from 
wild-type and HAT4-transgenic plants were 
labeled with fluorescein and lissamine, 
respectively; the two probes were then mixed 
and hybridized to a single array. An intense 
hybridization signal was observed at the 
position of the HAT4 cDNA in the 
lissamine-specific scan (Fig. 1D), but not in the 
fluorescein-specific scan of the same array 
(Fig. 1C). Calibration with AChR mRNA 
added to the fluorescein and lissamine 
cDNA synthesis reactions at dilutions of 
1:10,000 (Fig. 1C) and 1:100 (Fig. 1D), 
respectively, revealed a 50-fold elevation of 
HAT4 mRNA in the transgenic line 
relative to its abundance in wild-type plants 
(Table 2). This magnitude of HAT4 
overexpression matched that inferred from the 
Northern (RNA) analysis within a factor of 
2 (Fig. 2 and Table 2). Expression of all the 
other genes monitored on the array differed 
by less than a factor of 5 between 
HAT4-transgenic and wild-type plants (Fig. 1, C 


Fig. 2. Gene expression monitored with RNA 
(Northern) blot analysis. Designated amounts of 
mRNA (lanes labeled 1.0, 0.1, and 0.01) from wild-type 
and HAT4-transgenic plants were spotted onto nylon 
membranes and probed with the cDNAs indicated. Purified human 
AChR mRNA was used for calibration. 

Fig. 1. Gene expression monitored with the use of cDNA microarrays. Fluorescent scans represented in 
pseudocolor correspond to hybridization intensities. Color bars were calibrated from the signal obtained 
with the use of known concentrations of human AChR mRNA in independent experiments. Numbers and 
letters on the axes mark the position of each cDNA. (A) High-sensitivity fluorescein scan after hybridization 
with fluorescein-labeled cDNA derived from wild-type plants. (B) Same array as in (A) but scanned at 
moderate sensitivity. (C and D) A single array was probed with a 1:1 mixture of fluorescein-labeled cDNA 
from wild-type plants and lissamine-labeled cDNA from HAT4-transgenic plants. The single array was 
then scanned successively to detect the fluorescein fluorescence corresponding to mRNA from wild-type 
plants (C) and the lissamine fluorescence corresponding to mRNA from HAT4-transgenic plants (D). (E 

and D, and Table 2). Hybridization of 
fluorescein-labeled glucocorticoid receptor 
cDNA (Fig. 1C) and lissamine-labeled 
TRP4 cDNA (Fig. 1D) verified the 
presence of the negative control targets and the 
lack of optical cross talk between the two 
fluorophores. 
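
The calibration behind the 50-fold figure can be sketched as follows. Each channel carries its own AChR standard at a known dilution (1:10,000 for fluorescein, 1:100 for lissamine), so a gene's abundance in a channel is estimated from its signal relative to that channel's standard. The sketch assumes signal is proportional to abundance within each scan; the function name and all intensity values are hypothetical, chosen only so the arithmetic reproduces a 50-fold difference.

```python
def abundance(gene_signal, standard_signal, standard_dilution):
    """Estimate abundance (w/w) from signal relative to the AChR standard."""
    return gene_signal / standard_signal * standard_dilution

# Fluorescein channel: wild-type mRNA, AChR doped at 1:10,000.
hat4_wild_type = abundance(gene_signal=200.0,
                           standard_signal=1000.0,
                           standard_dilution=1 / 10_000)

# Lissamine channel: HAT4-transgenic mRNA, AChR doped at 1:100.
hat4_transgenic = abundance(gene_signal=1000.0,
                            standard_signal=10_000.0,
                            standard_dilution=1 / 100)

fold_elevation = hat4_transgenic / hat4_wild_type
print(round(fold_elevation, 1))
```

Using a more concentrated standard in the lissamine channel keeps the standard and the overexpressed transcript within the same part of the detector's range.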

To explore a more complex alteration in 
expression patterns, we performed a second 
two-color hybridization experiment with 
fluorescein- and lissamine-labeled probes 
prepared from root and leaf mRNA, 
respectively. The scanning sensitivities for the 
two fluorophores were normalized by 
matching the signals resulting from AChR 
mRNA, which was added to both cDNA 
synthesis reactions at a dilution of 1:1000 
(Fig. 1, E and F). A comparison of the scans 
revealed widespread differences in gene 
expression between root and leaf tissue (Fig. 1, 
E and F). The mRNA from the 
light-regulated CAB1 gene was ~500-fold more 
abundant in leaf (Fig. 1F) than in root tissue 
(Fig. 1E). The expression of 26 other genes 
differed between root and leaf tissue by 
more than a factor of 5 (Fig. 1, E and F). 

The HAT4 -transgenic line we examined 
has elongated hypocotyls, early flowering, 
poor germination, and altered pigmentation 
(8). Although changes in expression were 



Table 1. Sequences contained on the cDNA microarray. Shown are the position, the known or putative
function, and the accession number of each cDNA in the microarray (Fig. 1). All but three of the ESTs used
in this study matched a sequence in the database. NADH, reduced form of nicotinamide adenine
dinucleotide; ATPase, adenosine triphosphatase; GTP, guanosine triphosphate.



Position  cDNA    Function                        Accession number

a1,2      AChR    Human AChR                      [missing]
a3,4      EST3    Actin                           H36235
a5,6      EST6    NADH dehydrogenase              [missing]
a7,8      AAC1    Actin 1                         [illegible]
a9,10     EST12   Unknown                         [illegible]†
a11,12    EST13   Actin                           [missing]
b1,2      CAB1    Chlorophyll a/b binding         [illegible]
b3,4      EST17   Phosphoglycerate kinase         [illegible]
b5,6      GA4     Gibberellic acid biosynthesis   [missing]
b7,8      EST19   Unknown                         U38595†
b9,10     GBF-1   G-box binding factor 1          X63894
b11,12    EST23   Elongation factor               X52256
c1,2      EST29   Aldolase                        T04477
c3,4      GBF-2   G-box binding factor 2          X63895
c5,6      EST34   Chloroplast protease            R87034
c7,8      EST35   Unknown                         T14152
c9,10     EST41   Catalase                        T22720
c11,12    rGR     Rat glucocorticoid receptor     M14053
d1,2      EST42   Unknown                         U36596†
d3,4      EST45   ATPase                          J04185
d5,6      HAT1    Homeobox-leucine zipper 1       U09332
d7,8      EST46   Light harvesting complex        T04063
d9,10     EST49   Unknown                         T76267
d11,12    HAT2    Homeobox-leucine zipper 2       U09335
e1,2      HAT4    Homeobox-leucine zipper 4       M90394
e3,4      EST50   Phosphoribulokinase             T04344
e5,6      HAT5    Homeobox-leucine zipper 5       M90416
e7,8      EST51   Unknown                         Z33675
e9,10     HAT22   Homeobox-leucine zipper 22      U09336
e11,12    EST52   Oxygen evolving                 T21749
f1,2      EST59   Unknown                         Z34607
f3,4      KNAT1   Knotted-like homeobox 1         U14174
f5,6      EST60   RuBisCO small subunit           X14564
f7,8      EST69   Translation elongation factor   T42799
f9,10     PPH1    Protein phosphatase 1           U34803
f11,12    EST70   Unknown                         T44621
g1,2      EST75   Chloroplast protease            T43698
g3,4      EST78   Unknown                         R65481
g5,6      ROC1    Cyclophilin                     L14844
g7,8      EST82   GTP binding                     X59152
g9,10     EST83   Unknown                         Z33795
g11,12    EST84   Unknown                         T45278
h1,2      EST91   Unknown                         T13832
h3,4      EST96   Unknown                         R64816
h5,6      SAR1    Synaptobrevin                   M90418
h7,8      EST100  Light harvesting complex        Z18205
h9,10     EST103  Light harvesting complex        X03909
h11,12    TRP4    Yeast tryptophan biosynthesis   X04273



observed for HAT4, large changes in expression
were not observed for any of the
other 44 genes we examined. This was
somewhat surprising, particularly because
comparative analysis of leaf and root tissue
identified 27 differentially expressed genes.
Analysis of an expanded set of genes may be
required to identify genes whose expression 
changes upon HAT4 overexpression; alter- 
natively, a comparison of mRNA popula- 
tions from specific tissues of wild-type and 
HAT4-transgenic plants may allow identi- 
fication of downstream genes. 

At the current density of robotic printing,
it is feasible to scale up the fabrication pro- 
cess to produce arrays containing 20,000 
cDNA targets. At this density, a single array 
would be sufficient to provide gene-specific 
targets encompassing nearly the entire rep- 
ertoire of expressed genes in the Arabidopsis 
genome (2). The availability of 20,274 ESTs 
from Arabidopsis (1,9) would provide a rich 
source of templates for such studies. 

The estimated 100,000 genes in the human
genome (10) exceeds the number of
Arabidopsis genes by a factor of 5 (2). This 
modest increase in complexity suggests that 
similar cDNA microarrays, prepared from 
the rapidly growing repertoire of human 
ESTs (1), could be used to determine the
expression patterns of tens of thousands of 
human genes in diverse cell types. Coupling 
an amplification strategy to the reverse 
transcription reaction (11) could make it 
feasible to monitor expression even in 
minute tissue samples. A wide variety of 
acute and chronic physiological and patho- 
logical conditions might lead to character- 
istic changes in the patterns of gene expres- 
sion in peripheral blood cells or other easily 
sampled tissues. In concert with cDNA microarrays
for monitoring complex expression
patterns, these tissues might therefore
serve as sensitive in vivo sensors for clinical 
diagnosis. Microarrays of cDNAs could thus 
provide a useful link between human gene 
sequences and clinical medicine. 



Table 2. Gene expression monitoring by microarray
and RNA blot analyses; tg, HAT4-transgenic.
See Table 1 for additional gene information.
Expression levels (w/w) were calibrated with the use
of known amounts of human AChR mRNA. Values
for the microarray were determined from microarray
scans (Fig. 1); values for the RNA blot were
determined from RNA blots (Fig. 2).



*Proprietary sequence of Stratagene (La Jolla, California). †No match in the database; novel EST.



Gene         Expression level (w/w)
             Microarray    RNA blot

CAB1         1:48          1:63
CAB1 (tg)    1:120         1:150
HAT4         1:8300        1:6300
HAT4 (tg)    1:150         1410
ROC1         1:1200        1:1800
ROC1 (tg)    1:260         1:1300
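
As a worked illustration of how the w/w expression levels in Table 2 translate into fold changes between wild-type and HAT4-transgenic (tg) plants, the sketch below parses the "1:N" notation and takes the ratio. The helper names are hypothetical, and only the microarray column is used, with gene names read as CAB1, HAT4, and ROC1.

```python
from fractions import Fraction


def parse_level(text):
    """Parse an expression level written as '1:N' into the fraction 1/N."""
    numerator, denominator = text.split(":")
    return Fraction(int(numerator), int(denominator))


def fold_change(wild_type, transgenic):
    """Fold change of the transgenic level relative to the wild-type level."""
    return parse_level(transgenic) / parse_level(wild_type)


# Microarray column of Table 2: (wild type, transgenic).
microarray = {
    "CAB1": ("1:48", "1:120"),
    "HAT4": ("1:8300", "1:150"),
    "ROC1": ("1:1200", "1:260"),
}

for gene, (wt, tg) in microarray.items():
    print(gene, round(float(fold_change(wt, tg)), 2))
```

On these figures HAT4 mRNA is roughly 55 times more abundant in the transgenic line, whereas CAB1 falls to about 0.4 of its wild-type level.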
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Gene Therapy in Peripheral Blood 
Lymphocytes and Bone Marrow for 
ADA⁻ Immunodeficient Patients

Claudio Bordignon,* Luigi D. Notarangelo, Nadia Nobili, 
Giuliana Ferrari, Giulia Casorati, Paola Panina, Evelina Mazzolari, 
Daniela Maggioni, Claudia Rossi, Paolo Servida, 
Alberto G. Ugazio, Fulvio Mavilio

Adenosine deaminase (ADA) deficiency results in severe combined immunodeficiency,
the first genetic disorder treated by gene therapy. Two different retroviral vectors were 
used to transfer ex vivo the human ADA minigene into bone marrow cells and peripheral 
blood lymphocytes from two patients undergoing exogenous enzyme replacement ther- 
apy. After 2 years of treatment long-term survival of T and B lymphocytes, marrow cells, 
and granulocytes expressing the transferred ADA gene was demonstrated and resulted 
in normalization of the immune repertoire and restoration of cellular and humoral immunity. 
After discontinuation of treatment, T lymphocytes, derived from transduced peripheral 
blood lymphocytes, were progressively replaced by marrow-derived T cells in both patients.
These results indicate successful gene transfer into long-lasting progenitor cells,
producing a functional multilineage progeny. 



Severe combined immunodeficiency associated
with inherited deficiency of ADA
(1) is usually fatal unless affected children
are kept in protective isolation or the immune
system is reconstituted by bone marrow
transplantation from a human leukocyte
antigen (HLA)-identical sibling donor
(2). This is the therapy of choice, although
it is available only for a minority of patients.
In recent years, other forms of therapy have
been developed, including transplants from
haploidentical donors (3, 4), exogenous enzyme
replacement (5), and somatic-cell
gene therapy (6-9).

We previously reported a preclinical mod- 
el in which ADA gene transfer and expression 
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successfully restored immune functions in hu- 
man ADA-deficient (ADA⁻) peripheral
blood lymphocytes (PBLs) in immunodeficient
mice in vivo (10, 11). On the basis of
these preclinical results, the clinical application
of gene therapy for the treatment of
ADA⁻ SCID (severe combined immunodeficiency
disease) patients who previously failed
exogenous enzyme replacement therapy was
approved by our Institutional Ethical Committees
and by the Italian National Committee
for Bioethics (12). In addition to evaluating
the safety and efficacy of the gene therapy
procedure, the aim of the study was to define
the relative role of PBLs and hematopoietic
stem cells in the long-term reconstitution of
immune functions after retroviral vector-mediated
ADA gene transfer. For this purpose,
two structurally identical vectors expressing
the human ADA complementary DNA
(cDNA), distinguishable by the presence of
alternative restriction sites in a nonfunctional
region of the viral long-terminal repeat
(LTR), were used to transduce PBLs and bone
marrow (BM) cells independently. This procedure
allowed identification of the origin of
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METHOD AMD ACTAp miTTa wov TiBPTnmTr ff 

Field of the Invention
This invention relates to a method and apparatus
for fabricating microarrays of biological samples for
large scale screening assays, such as arrays of DNA
samples to be used in DNA hybridization assays for
genetic research and diagnostic applications.

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Backgro und of the Invention 

A variety of methods are currently available for
making arrays of biological macromolecules, such as
arrays of nucleic acid molecules or proteins. One
method for making ordered arrays of DNA on a porous
membrane is a "dot blot" approach. In this method, a
vacuum manifold transfers a plurality, e.g., 96,
aqueous samples of DNA from 3 millimeter diameter wells
to a porous membrane. A common variant of this
procedure is a "slot-blot" method in which the wells
have highly-elongated oval shapes.

The DNA is immobilized on the porous membrane by
baking the membrane or exposing it to UV radiation.
This is a manual procedure practical for making one
array at a time and usually limited to 96 samples per
array. "Dot-blot" procedures are therefore inadequate
for applications in which many thousand samples must be
determined.

A more efficient technique employed for making
ordered arrays of genomic fragments uses an array of
pins dipped into the wells, e.g., the 96 wells of a
microtitre plate, for transferring an array of samples
to a substrate, such as a porous membrane. One array
includes pins that are designed to spot a membrane in a
staggered fashion, for creating an array of 9216 spots
in a 22 x 22 cm area (Lehrach, et al., 1990). A
limitation with this approach is that the volume of DNA
spotted in each pixel of each array is highly variable.
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In addition, the number of arrays that can be made with 
each dipping is usually quite small. 

An alternate method of creating ordered arrays of
nucleic acid sequences is described by Pirrung, et al.
(1992), and also by Fodor, et al. (1991). The method
involves synthesizing different nucleic acid sequences
at different discrete regions of a support. This
method employs elaborate synthetic schemes, and is
generally limited to relatively short nucleic acid
samples, e.g., less than 20 bases. A related method has
been described by Southern, et al. (1992).

Khrapko, et al. (1991) describes a method of
making an oligonucleotide matrix by spotting DNA onto a
thin layer of polyacrylamide. The spotting is done
manually with a micropipette.

None of the methods or devices described in the
prior art are designed for mass fabrication of
microarrays characterized by (i) a large number of
micro-sized assay regions separated by a distance of
50-200 microns or less, and (ii) a well-defined amount,
typically in the picomole range, of analyte associated
with each region of the array.

Furthermore, current technology is directed at
performing such assays one at a time to a single array
of DNA molecules. For example, the most common method
for performing DNA hybridizations to arrays spotted
onto a porous membrane involves sealing the membrane in a
plastic bag (Maniatis, et al., 1989) or a rotating
glass cylinder (Robbins Scientific) with the labeled
hybridization probe inside the sealed chamber. For
arrays made on non-porous surfaces, such as a
microscope slide, each array is incubated with the
labeled hybridization probe sealed under a coverslip.
These techniques require a separate sealed chamber for
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each array which makes the screening and handling of 
many such arrays inconvenient and time intensive. 

Abouzied, et al. (1994) describes a method of
printing horizontal lines of antibodies on a
nitrocellulose membrane and separating regions of the
membrane with vertical stripes of a hydrophobic
material. Each vertical stripe is then reacted with a
different antigen and the reaction between the
immobilized antibody and an antigen is detected using a
standard ELISA colorimetric technique. Abouzied's
technique makes it possible to screen many one-
dimensional arrays simultaneously on a single sheet of
nitrocellulose. Abouzied makes the nitrocellulose
somewhat hydrophobic using a line drawn with PAP Pen
(Research Products International). However, Abouzied
does not describe a technology that is capable of
completely sealing the pores of the nitrocellulose. The
pores of the nitrocellulose are still physically open
and so the assay reagents can leak through the
hydrophobic barrier during extended high temperature
incubations or in the presence of detergents, which
makes the Abouzied technique unacceptable for DNA
hybridization assays.

Porous membranes with printed patterns of
hydrophilic/hydrophobic regions exist for applications
such as ordered arrays of bacteria colonies. QA Life
Sciences (San Diego CA) makes such a membrane with a
grid pattern printed on it. However, this membrane has
the same disadvantage as the Abouzied technique since
reagents can still flow between the gridded arrays,
making them unusable for separate DNA hybridization
assays.

Pall Corporation makes a 96-well plate with a
porous filter heat sealed to the bottom of the plate.
These plates are capable of containing different
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reagents in each well without cross-contamination. 
However, each well is intended to hold only one target 
element whereas the invention described here makes a 
microarray of many biomolecules in each subdivided 
region of the solid support. Furthermore, the 96 well 
plates are at least 1 cm thick and prevent the use of 
the device for many colorimetric, fluorescent and 
radioactive detection formats which require that the 
membrane lie flat against the detection surface. The 
invention described here requires no further processing
after the assay step since the barrier elements are
shallow and do not interfere with the detection step,
thereby greatly increasing convenience.

Hyseq Corporation has described a method of making 
an "array of arrays" on a non-porous solid support for 
use with their sequencing by hybridization technique. 
The method described by Hyseq involves modifying the 
chemistry of the solid support material to form a 
hydrophobic grid pattern where each subdivided region 
contains a microarray of biomolecules. Hyseq's flat
hydrophobic pattern does not make use of physical 
blocking as an additional means of preventing cross 
contamination. 



Summary of the Invention

The invention includes, in one aspect, a method of
forming a microarray of analyte-assay regions on a
solid support, where each region in the array has a
known amount of a selected, analyte-specific reagent.
The method involves first loading a solution of a
selected analyte-specific reagent in a reagent-
dispensing device having an elongate capillary channel
(i) formed by spaced-apart, coextensive elongate
members, (ii) adapted to hold a quantity of the reagent
solution and (iii) having a tip region at which aqueous
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solution in the channel forms a meniscus. The channel 
is preferably formed by a pair of spaced-apart tapered
elements.

The tip of the dispensing device is tapped against
a solid support at a defined position on the support
surface with an impulse effective to break the meniscus
in the capillary channel and deposit a selected volume of
solution on the surface, preferably a selected volume
in the range 0.01 to 100 nl. The two steps are
repeated until the desired array is formed.

The method may be practiced in forming a plurality
of such arrays, where the solution-depositing step
is applied to a selected position on each of a
plurality of solid supports at each repeat cycle.

The dispensing device may be loaded with a new
solution, by the steps of (i) dipping the capillary
channel of the device in a wash solution, (ii) removing
wash solution drawn into the capillary channel, and
(iii) dipping the capillary channel into the new
reagent solution.

Also included in the invention is an automated
apparatus for forming a microarray of analyte-assay
regions on a plurality of solid supports, where each
region in the array has a known amount of a selected,
analyte-specific reagent. The apparatus has a holder
for holding, at known positions, a plurality of planar
supports, and a reagent dispensing device of the type
described above.

The apparatus further includes positioning
structure for positioning the dispensing device at a
selected array position with respect to a support in
said holder, and dispensing structure for moving the
dispensing device into tapping engagement against a
support with a selected impulse effective to deposit a
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selected volume on the support, e.g., a selected volume 
in the volume range 0.01 to 100 nl. 

The positioning and dispensing structures are
controlled by a control unit in the apparatus. The
unit operates to (i) place the dispensing device at a
loading station, (ii) move the capillary channel in the
device into a selected reagent at the loading station,
to load the dispensing device with the reagent, and
(iii) dispense the reagent at a defined array position
on each of the supports on said holder. The unit may
further operate, at the end of a dispensing cycle, to
wash the dispensing device by (i) placing the
dispensing device at a washing station, (ii) moving the
capillary channel in the device into a wash fluid, to
load the dispensing device with the fluid, and (iii)
removing the wash fluid prior to loading the dispensing
device with a fresh selected reagent.
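
The control-unit cycle described above can be sketched as an ordered sequence of actions. This is a schematic illustration only: the function name, the tuple-based action log, and the reagent labels are hypothetical, and no motion control or timing of the actual apparatus is modeled.

```python
def plan_print_run(reagents, n_supports):
    """Return the ordered actions for depositing each reagent at its own
    array position on every support, washing the dispenser between reagents."""
    actions = []
    for position, reagent in enumerate(reagents):
        # (i)-(ii): move to the loading station and dip into the reagent.
        actions.append(("load", reagent))
        # (iii): tap the same array position on each support in the holder.
        for support in range(n_supports):
            actions.append(("dispense", reagent, support, position))
        # End of cycle: dip in wash fluid, then remove it before reloading.
        actions.append(("wash",))
        actions.append(("remove_wash",))
    return actions


actions = plan_print_run(["reagent_A", "reagent_B"], n_supports=3)
for action in actions:
    print(action)
```

Loading once and then visiting every support is what lets a single dip of the dispenser produce the same spot on many arrays.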

The dispensing device in the apparatus may be one 
of a plurality of such devices which are carried on the 
arm for dispensing different analyte assay reagents at 
selected spaced array positions. 

In another aspect, the invention includes a
substrate with a surface having a microarray of at
least 10³ distinct polynucleotide or polypeptide
biopolymers in a surface area of less than about 1 cm².
Each distinct biopolymer (i) is disposed at a separate,
defined position in said array, (ii) has a length of at
least 50 subunits, and (iii) is present in a defined
amount between about 0.1 femtomoles and 100 nanomoles.
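
To give a sense of scale for the stated amounts, the short calculation below converts the two limits into numbers of molecules. It is an illustrative aside rather than part of the specification; the function name is hypothetical, and the only outside fact used is the Avogadro constant.

```python
AVOGADRO = 6.02214076e23  # molecules per mole (exact SI definition)


def molecules(moles):
    """Number of molecules in the given amount of substance."""
    return moles * AVOGADRO


low = molecules(0.1e-15)   # 0.1 femtomoles
high = molecules(100e-9)   # 100 nanomoles
print(f"{low:.3g} to {high:.3g} molecules per array position")
```

The lower limit corresponds to roughly sixty million molecules at a single array position.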

In one embodiment, the surface is a glass slide
surface coated with a polycationic polymer, such as
polylysine, and the biopolymers are polynucleotides.
In another embodiment, the substrate has a water-
impermeable backing, a water-permeable film formed on
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the backing, and a grid formed on the film. The grid 
is composed of intersecting water-impervious grid 
elements extending from said backing to positions 
raised above the surface of said film, and partitions 
the film into a plurality of water-impervious cells. A
biopolymer array is formed within each cell.

More generally, there is provided a substrate for
use in detecting binding of labeled polynucleotides to
one or more of a plurality of different-sequence,
immobilized polynucleotides. The substrate includes,
in one aspect, a glass support, a coating of a
polycationic polymer, such as polylysine, on said
surface of the support, and an array of distinct
polynucleotides electrostatically bound non-covalently
to said coating, where each distinct biopolymer is
disposed at a separate, defined position in a surface
array of polynucleotides.

In another aspect, the substrate includes a water-
impermeable backing, a water-permeable film formed on
the backing, and a grid formed on the film, where the
grid is composed of intersecting water-impervious grid
elements extending from the backing to positions raised
above the surface of the film, forming a plurality of
cells. A biopolymer array is formed within each cell.

Also forming part of the invention is a method of
detecting differential expression of each of a
plurality of genes in a first cell type, with respect
to expression of the same genes in a second cell type.
In practicing the method, there is first produced
fluorescent-labeled cDNA's from mRNA's isolated from
the two cell types, where the cDNA's from the first
and second cells are labeled with first and second
different fluorescent reporters.

A mixture of the labeled cDNA's from the two cell
types is added to an array of polynucleotides
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representing a plurality of known genes derived from 
the two cell types, under conditions that result in 
hybridization of the cDNA's to complementary-sequence 
polynucleotides in the array. The array is then
examined by fluorescence under fluorescence excitation
conditions in which (i) polynucleotides in the array
that are hybridized predominantly to cDNA's derived
from one of the first and second cell types give a
distinct first or second fluorescence emission color,
respectively, and (ii) polynucleotides in the array
that are hybridized to substantially equal numbers of
cDNA's derived from the first and second cell types
give a distinct combined fluorescence emission color,
respectively. The relative expression of known genes
in the two cell types can then be determined by the
observed fluorescence emission color of each spot.
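
The color-based readout just described can be sketched as a simple classification of each spot from its two reporter intensities. This is a hypothetical illustration: the function name, the twofold cutoff, and the intensity values are assumptions made for the example, and the text itself does not specify any numerical threshold.

```python
def classify_spot(first, second, ratio_cutoff=2.0):
    """Classify a spot by which fluorescent reporter dominates its signal.

    `first` and `second` are background-corrected intensities of the two
    reporters; `ratio_cutoff` is an assumed, adjustable threshold.
    """
    if first <= 0 and second <= 0:
        return "no signal"
    if second <= 0 or first / second >= ratio_cutoff:
        return "first color"      # mostly cDNA from the first cell type
    if first <= 0 or second / first >= ratio_cutoff:
        return "second color"     # mostly cDNA from the second cell type
    return "combined color"       # roughly equal amounts from both


# Hypothetical intensities for three spots.
spots = {"gene_1": (900.0, 100.0), "gene_2": (80.0, 640.0), "gene_3": (500.0, 450.0)}
calls = {name: classify_spot(a, b) for name, (a, b) in spots.items()}
print(calls)
```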

These and other objects and features of the 
invention will become more fully apparent when the 
following detailed description of the invention is read 
in conjunction with the accompanying figures. 

Brief Description of the Drawings 
Fig. 1 is a side view of a reagent-dispensing
device having an open-capillary dispensing head
constructed for use in one embodiment of the invention;

Figs. 2A-2C illustrate steps in the delivery of a 
fixed-volume bead on a hydrophobic surface employing 
the dispensing head from Fig. 1, in accordance with one 
embodiment of the method of the invention; 

Fig. 3 shows a portion of a two-dimensional array 
of analyte-assay regions constructed according to the 
method of the invention; 

Fig. 4 is a planar view showing components of an 
automated apparatus for forming arrays in accordance 
with the invention. 
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Fig. 5 shows a fluorescent image of an actual 20 x 
20 array of 400 fluorescently-labeled DNA samples
immobilized on a poly-l-lysine coated slide, where the
total area covered by the 400 element array is 16
square millimeters;

Fig. 6 is a fluorescent image of a 1.8 cm x 1.8 cm
microarray containing lambda clones with yeast inserts,
the fluorescent signal arising from the hybridization
to the array with approximately half the yeast genome
labeled with a green fluorophore and the other half
with a red fluorophore;

Fig. 7 shows the translation of the hybridization
image of Fig. 6 into a karyotype of the yeast genome,
where the elements of the Fig. 6 microarray contain yeast
DNA sequences that have been previously physically
mapped in the yeast genome;

Fig. 8 shows a fluorescent image of a 0.5 cm x 0.5
cm microarray of 24 cDNA clones, where the microarray
was hybridized simultaneously with total cDNA from wild
type Arabidopsis plant labeled with a green fluorophore
and total cDNA from a transgenic Arabidopsis plant
labeled with a red fluorophore, and the arrow points to
the cDNA clone representing the gene introduced into
the transgenic Arabidopsis plant;
Fig. 9 shows a plan view of a substrate having an
array of cells formed by barrier elements in the form
of a grid;

Fig. 10 shows an enlarged plan view of one of the
cells in the substrate in Fig. 9, showing an array of
polynucleotide regions in the cell;

Fig. 11 is an enlarged sectional view of the 
substrate in Fig. 9, taken along a section line in that 
figure; and 

Fig. 12 is a scanned image of a 3 cm x 3 cm
nitrocellulose solid support containing four identical
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arrays of M13 clones in each of four quadrants, where 
each quadrant was hybridized simultaneously to a 
different oligonucleotide using an open face 
hybridization method. 

Detailed Description of the Invention

I. Definitions

Unless indicated otherwise, the terms defined 
below have the following meanings: 

"Ligand" refers to one member of a ligand/anti-
ligand binding pair. The ligand may be, for example,
one of the nucleic acid strands in a complementary,
hybridized nucleic acid duplex binding pair; an
effector molecule in an effector/receptor binding pair;
or an antigen in an antigen/antibody or
antigen/antibody fragment binding pair.

"Antiligand" refers to the opposite member of a 
ligand/anti-ligand binding pair. The antiligand may be
the other of the nucleic acid strands in a
complementary, hybridized nucleic acid duplex binding
pair; the receptor molecule in an effector/receptor
binding pair; or an antibody or antibody fragment
molecule in antigen/antibody or antigen/antibody
fragment binding pair, respectively.

"Analyte" or "analyte molecule" refers to a
molecule, typically a macromolecule, such as a
polynucleotide or polypeptide, whose presence, amount,
and/or identity are to be determined. The analyte is
one member of a ligand/anti-ligand pair.

"Analyte-specific assay reagent" refers to a
molecule effective to bind specifically to an analyte
molecule. The reagent is the opposite member of a
ligand/anti-ligand binding pair.

An "array of regions on a solid support" is a
linear or two-dimensional array of preferably discrete
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regions, each having a finite area, formed on the 
surface of a solid support. 

A "microarray" is an array of regions having a 
density of discrete regions of at least about 100/cm²,
and preferably at least about 1000/cm². The regions in
a microarray have typical dimensions, e.g., diameters,
in the range of between about 10-250 µm, and are
separated from other regions in the array by about the
same distance.
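
The density criterion in this definition can be checked with simple arithmetic. The sketch below is illustrative only, with hypothetical function names; it uses two layouts mentioned elsewhere in this document, the 400-element array covering 16 square millimeters (Fig. 5) and the 9216-spot membrane covering 22 x 22 cm (Lehrach, et al., 1990).

```python
def density_per_cm2(n_regions, area_cm2):
    """Number of discrete regions per square centimeter."""
    return n_regions / area_cm2


def is_microarray(n_regions, area_cm2, minimum=100.0):
    """True if the density meets the 'at least about 100/cm2' criterion."""
    return density_per_cm2(n_regions, area_cm2) >= minimum


fig5_density = density_per_cm2(400, 16 / 100.0)   # 16 mm2 = 0.16 cm2
membrane_density = density_per_cm2(9216, 22 * 22)  # 484 cm2

print(round(fig5_density), round(membrane_density, 1))
print(is_microarray(400, 0.16), is_microarray(9216, 484))
```

By this measure the Fig. 5 array, at about 2500 regions/cm², exceeds even the preferred 1000/cm², while the pin-spotted membrane, at about 19 regions/cm², does not qualify.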

A support surface is "hydrophobic" if an aqueous-
medium droplet applied to the surface does not spread
out substantially beyond the area size of the applied
droplet. That is, the surface acts to prevent
spreading of the droplet applied to the surface by
hydrophobic interaction with the droplet.

A "meniscus" means a concave or convex surface 
that forms on the bottom of a liquid in a channel as a 
result of the surface tension of the liquid. 

"Distinct biopolymers", as applied to the 
biopolymers forming a microarray, means an array member
which is distinct from other array members on the basis
of a different biopolymer sequence, and/or different
concentrations of the same or distinct biopolymers,
and/or different mixtures of distinct or different-
concentration biopolymers. Thus an array of "distinct
polynucleotides" means an array containing, as its
members, (i) distinct polynucleotides, which may have a
defined amount in each member, (ii) different, graded
concentrations of given-sequence polynucleotides,
and/or (iii) different-composition mixtures of two or
more distinct polynucleotides.

"Cell type" means a cell from a given source, 
e.g., a tissue, or organ, or a cell in a given state of 
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differentiation, or a cell associated with a given 
pathology or genetic makeup. 

II. Method of Microarray Formation

This section describes a method of forming a 
microarray of analyte-assay regions on a solid support 
or substrate, where each region in the array has a 
known amount of a selected, analyte-specific reagent.

Fig. 1 illustrates, in a partially schematic view,
a reagent-dispensing device 10 useful in practicing the
method. The device generally includes a reagent
dispenser 12 having an elongate open capillary channel
14 adapted to hold a quantity of the reagent solution,
such as indicated at 16, as will be described below.
The capillary channel is formed by a pair of spaced-
apart, coextensive, elongate members 12a, 12b which are
tapered toward one another and converge at a tip or tip
region 18 at the lower end of the channel. More
generally, the open channel is formed by at least two
elongate, spaced-apart members adapted to hold a
quantity of reagent solutions and having a tip region
at which aqueous solution in the channel forms a
meniscus, such as the concave meniscus illustrated at
20 in Fig. 2A. The advantages of the open channel
construction of the dispenser are discussed below.

With continued reference to Fig. 1, the dispenser
device also includes structure for moving the dispenser
rapidly toward and away from a support surface, for
effecting deposition of a known amount of solution in
the dispenser on a support, as will be described below
with reference to Figs. 2A-2C. In the embodiment
shown, this structure includes a solenoid 22 which is
activatable to draw a solenoid piston 24 rapidly
downwardly, then release the piston, e.g., under spring
bias, to a normal, raised position, as shown. The
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dispenser is carried on the piston by a connecting 
member 26, as shown. The just-described moving
structure is also referred to herein as dispensing
means for moving the dispenser into engagement with a
solid support, for dispensing a known volume of fluid
on the support.

The dispensing device just described is carried on
an arm 28 that may be moved either linearly or in an x-y
plane to position the dispenser at a selected
deposition position, as will be described.

Figs. 2A-2C illustrate the method of depositing a
known amount of reagent solution in the just-described
dispenser on the surface of a solid support, such as
the support indicated at 30. The support is a polymer,
glass, or other solid-material support having a surface
indicated at 31.

In one general embodiment, the surface is a
relatively hydrophilic, i.e., wettable surface, such as
a surface having native, bound or covalently attached
charged groups. One such surface described below is a
glass surface having an absorbed layer of a
polycationic polymer, such as poly-l-lysine.

In another embodiment, the surface has or is
formed to have a relatively hydrophobic character,
i.e., one that causes aqueous medium deposited on the
surface to bead. A variety of known hydrophobic
polymers, such as polystyrene, polypropylene, or
polyethylene have desired hydrophobic properties, as do
glass and a variety of lubricant or other hydrophobic
films that may be applied to the support surface.

Initially, the dispenser is loaded with a selected
analyte-specific reagent solution, such as by dipping
the dispenser tip, after washing, into a solution of
the reagent, and allowing filling by capillary flow
into the dispenser channel. The dispenser is now moved
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to a selected position with respect to a support 
surface, placing the dispenser tip directly above the 
support-surface position at which the reagent is to be 
deposited. This movement takes place with the 
dispenser tip in its raised position, as seen in Fig. 
2A, where the tip is typically at least several 1-5 mm 
above the surface of the substrate. 

With the dispenser so positioned, solenoid 22 is
now activated to cause the dispenser tip to move
rapidly toward and away from the substrate surface,
making momentary contact with the surface, in effect,
tapping the tip of the dispenser against the support
surface. The tapping movement of the tip against the
surface acts to break the liquid meniscus in the tip
channel, bringing the liquid in the tip into contact
with the support surface. This, in turn, produces a
flowing of the liquid into the capillary space between
the tip and the surface, acting to draw liquid out of
the dispenser channel, as seen in Fig. 2B.

Fig. 2C shows flow of fluid from the tip onto the
support surface, which in this case is a hydrophobic
surface. The figure illustrates that liquid continues
to flow from the dispenser onto the support surface
until it forms a liquid bead 32. At a given bead size,
i.e., volume, the tendency of liquid to flow onto the
surface will be balanced by the hydrophobic surface
interaction of the bead with the support surface, which
acts to limit the total bead area on the surface, and
by the surface tension of the droplet, which tends
toward a given bead curvature. At this point, a given
bead volume will have formed, and continued contact of
the dispenser tip with the bead, as the dispenser tip
is being withdrawn, will have little or no effect on
bead volume.

For liquid-dispensing on a more hydrophilic
surface, the liquid will have less of a tendency to
bead, and the dispensed volume will be more sensitive
to the total dwell time of the dispenser tip in the
immediate vicinity of the support surface, e.g., the
positions illustrated in Figs. 2B and 2C.

The desired deposition volume, i.e., bead volume,
formed by this method is preferably in the range 2 pl
(picoliters) to 2 nl (nanoliters), although volumes as
high as 100 nl or more may be dispensed. It will be
appreciated that the selected dispensed volume will
depend on (i) the "footprint" of the dispenser tip,
i.e., the size of the area spanned by the tip, (ii) the
hydrophobicity of the support surface, and (iii) the
time of contact with and rate of withdrawal of the tip
from the support surface. In addition, bead size may
be reduced by increasing the viscosity of the medium,
effectively reducing the flow time of liquid from the
dispenser onto the support surface. The drop size may
be further constrained by depositing the drop in a
hydrophilic region surrounded by a hydrophobic grid
pattern on the support surface.

In a typical embodiment, the dispenser tip is
tapped rapidly against the support surface, with a
total residence time in contact with the support of
less than about 1 msec, and a rate of upward travel
from the surface of about 10 cm/sec.

Assuming that the bead that forms on contact with
the surface is a hemispherical bead, with a diameter
approximately equal to the width of the dispenser tip,
as shown in Fig. 2C, the volume of the bead formed in
relation to dispenser tip width (d) is given in Table 1
below. As seen, the volume of the bead ranges from
2 pl to 2 nl as the width size is increased from about
20 to 200 µm.

                  Table 1

        d               Volume (nl)

        20 µm           2 x 10⁻³
        50 µm           3.1 x 10⁻²
        100 µm          2.5 x 10⁻¹
        200 µm          2

At a given tip size, bead volume can be reduced in
a controlled fashion by increasing surface
hydrophobicity, reducing time of contact of the tip
with the surface, increasing rate of movement of the
tip away from the surface, and/or increasing the
viscosity of the medium. Once these parameters are
fixed, a selected deposition volume in the desired pl
to nl range can be achieved in a repeatable fashion.
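
The volumes listed in Table 1 can be checked against the
hemispherical-bead assumption stated above. The following is a
minimal sketch only, not part of the original disclosure: the
function name is illustrative, and the bead diameter is assumed
to equal the tip width d exactly.

```python
import math


def hemispherical_bead_volume_nl(tip_width_um: float) -> float:
    """Volume in nl of a hemispherical bead whose diameter equals the tip width (in µm)."""
    radius_um = tip_width_um / 2.0
    # A hemisphere is half of a sphere: (1/2) * (4/3) * pi * r^3.
    volume_um3 = (2.0 / 3.0) * math.pi * radius_um ** 3
    # 1 nl = 1e-12 m^3 = 1e6 µm^3.
    return volume_um3 / 1e6


for d in (20, 50, 100, 200):
    print(f"{d:>4} µm -> {hemispherical_bead_volume_nl(d):.2e} nl")
```

Under this assumption the computed volumes run from about 2 pl
at a 20 µm tip width to about 2 nl at 200 µm, within roughly 6%
of the rounded values given in Table 1.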

After depositing a bead at one selected location
on a support, the tip is typically moved to a
corresponding position on a second support, a droplet
is deposited at that position, and this process is
repeated until a liquid droplet of the reagent has been
deposited at a selected position on each of a plurality
of supports.

The tip is then washed to remove the reagent
liquid, filled with another reagent liquid, and this
reagent is now deposited at another array position
on each of the supports. In one embodiment, the tip is
washed and refilled by the steps of (i) dipping the
capillary channel of the device in a wash solution,
(ii) removing wash solution drawn into the capillary
channel, and (iii) dipping the capillary channel into
the new reagent solution.

From the foregoing, it will be appreciated that
the tweezers-like, open-capillary dispenser tip
provides the advantages that (i) the open channel of
the tip facilitates rapid, efficient washing and drying
before reloading the tip with a new reagent, (ii)
passive capillary action can load the sample directly
from a standard microwell plate while retaining
sufficient sample in the open capillary reservoir for
the printing of numerous arrays, (iii) open capillaries
are less prone to clogging than closed capillaries, and
(iv) open capillaries do not require a perfectly faced
bottom surface for fluid delivery.

A portion of a microarray 36 formed on the surface
38 of a solid support 40 in accordance with the method
just described is shown in Fig. 3. The array is formed
of a plurality of analyte-specific reagent regions,
such as regions 42, where each region may include a
different analyte-specific reagent. As indicated
above, the diameter of each region is preferably
between about 20 and 200 µm. The spacing between each
region and its closest (non-diagonal) neighbor,
measured from center-to-center (indicated at 44), is
preferably in the range of about 20-400 µm. Thus, for
example, an array having a center-to-center spacing of
about 250 µm contains about 40 regions/cm or 1,600
regions/cm². After formation of the array, the support
is treated to evaporate the liquid of the droplet
forming each region, to leave a desired array of dried,
relatively flat regions. This drying may be done by
heating or under vacuum.
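
The relation between center-to-center spacing and array density
quoted above can be illustrated with a short calculation. This
is a sketch under the assumption of a square grid with equal
spacing in both directions; the function names are illustrative
and do not appear in the original text.

```python
def regions_per_cm(spacing_um: float) -> float:
    """Linear density of regions for a given center-to-center spacing in µm."""
    return 10_000 / spacing_um  # 1 cm = 10,000 µm


def regions_per_cm2(spacing_um: float) -> float:
    """Areal density for a square grid with the same spacing in x and y."""
    return regions_per_cm(spacing_um) ** 2


print(regions_per_cm(250), regions_per_cm2(250))
```

For a 250 µm spacing this gives 40 regions/cm and 1,600
regions/cm², matching the figures stated in the text.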

In some cases, it is desired to first rehydrate
the droplets containing the analyte reagents to allow
for more time for adsorption to the solid support. It
is also possible to spot out the analyte reagents in a
humid environment so that droplets do not dry until the
arraying operation is complete.

III. Automated Apparatus for Forming Arrays

In another aspect, the invention includes an
automated apparatus for forming an array of
analyte-assay regions on a solid support, where each
region in the array has a known amount of a selected,
analyte-specific reagent.

The apparatus is shown in planar, and partially
schematic view in Fig. 4. A dispenser device 72 in the
apparatus has the basic construction described above
with respect to Fig. 1, and includes a dispenser 74
having an open-capillary channel terminating at a tip,
substantially as shown in Figs. 1 and 2A-2C.

The dispenser is mounted in the device for
movement toward and away from a dispensing position at
which the tip of the dispenser taps a support surface,
to dispense a selected volume of reagent solution, as
described above. This movement is effected by a
solenoid 76 as described above. Solenoid 76 is under
the control of a control unit 77 whose operation will
be described below. The solenoid is also referred to
herein as dispensing means for moving the device into
tapping engagement with a support, when the device is
positioned at a defined array position with respect to
that support.

The dispenser device is carried on an arm 74 which
is threadedly mounted on a worm screw 80 driven
(rotated) in a desired direction by a stepper motor 82
also under the control of unit 77. At its left end in
the figure, screw 80 is carried in a sleeve 84 for
rotation about the screw axis. At its other end, the
screw is mounted to the drive shaft of the stepper
motor, which in turn is carried on a sleeve 86. The
dispenser device, worm screw, the two sleeves mounting
the worm screw, and the stepper motor used in moving
the device in the "x" (horizontal) direction in the
figure form what is referred to here collectively as a
displacement assembly 86.

The displacement assembly is constructed to
produce precise, micro-range movement in the direction
of the screw, i.e., along an x axis in the figure. In
one mode, the assembly functions to move the dispenser
in x-axis increments having a selected distance in the
range 5-25 µm. In another mode, the dispenser unit may
be moved in precise x-axis increments of several
microns or more, for positioning the dispenser at
associated positions on adjacent supports, as will be
described below.

The displacement assembly, in turn, is mounted for
movement in the "y" (vertical) axis of the figure, for
positioning the dispenser at a selected y axis
position. The structure mounting the assembly includes
a fixed rod 88 mounted rigidly between a pair of frame
bars 90, 92, and a worm screw 94 mounted for rotation
between a pair of frame bars 96, 98. The worm screw is
driven (rotated) by a stepper motor 100 which operates
under the control of unit 77. The motor is mounted on
bar 96, as shown.

The structure just described, including worm screw
94 and motor 100, is constructed to produce precise,
micro-range movement in the direction of the screw,
i.e., along a y axis in the figure. As above, the
structure functions in one mode to move the dispenser
in y-axis increments having a selected distance in the
range 5-250 µm, and in a second mode, to move the
dispenser in precise y-axis increments of several
microns (µm) or more, for positioning the dispenser at
associated positions on adjacent supports.

The displacement assembly and structure for moving
this assembly in the y axis are referred to herein
collectively as positioning means for positioning the
dispensing device at a selected array position with
respect to a support.

A holder 102 in the apparatus functions to hold a
plurality of supports, such as supports 104, on which
the microarrays of reagent regions are to be formed by
the apparatus. The holder provides a number of
recessed slots, such as slot 106, which receive the
supports, and position them at precise selected
positions with respect to the frame bars on which the
dispenser moving means is mounted.

As noted above, the control unit in the device
functions to actuate the two stepper motors and
dispenser solenoid in a sequence designed for automated
operation of the apparatus in forming a selected
microarray of reagent regions on each of a plurality of
supports.

The control unit is constructed, according to
conventional microprocessor control principles, to
provide appropriate signals to each of the solenoid and
each of the stepper motors, in a given timed sequence
and for appropriate signalling time. The construction
of the unit, and the settings that are selected by the
user to achieve a desired array pattern, will be
understood from the following description of a typical
apparatus operation.

Initially, one or more supports are placed in one
or more slots in the holder. The dispenser is then
moved to a position directly above a well (not shown)
containing a solution of the first reagent to be
dispensed on the support(s). The dispenser solenoid is
actuated now to lower the dispenser tip into this well,
causing the capillary channel in the dispenser to fill.
Motors 82, 100 are now actuated to position the
dispenser at a selected array position at the first of
the supports. Solenoid actuation of the dispenser is
then effective to dispense a selected-volume droplet of
that reagent at this location. As noted above, this
operation is effective to dispense a selected volume
preferably between 2 pl and 2 nl of the reagent
solution.

The dispenser is now moved to the corresponding
position at an adjacent support and a similar volume of
the solution is dispensed at this position. The
process is repeated until the reagent has been
dispensed at this preselected corresponding position on
each of the supports.

Where it is desired to dispense a single reagent
at more than two array positions on a support, the
dispenser may be moved to different array positions at
each support, before moving the dispenser to a new
support, or solution can be dispensed at individual
positions on each support, at one selected position,
then the cycle repeated for each new array position.

To dispense the next reagent, the dispenser is
positioned over a wash solution (not shown), and the
dispenser tip is dipped in and out of this solution
until the reagent solution has been substantially
washed from the tip. Solution can be removed from the
tip, after each dipping, by vacuum, compressed air
spray, sponge, or the like.

The dispenser tip is now dipped in a second
reagent well, and the filled tip is moved to a second
selected array position in the first support. The
process of dispensing reagent at each of the
corresponding second-array positions is then carried
out as above. This process is repeated until an entire
microarray of reagent solutions on each of the supports
has been formed.
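
The operating sequence just described (fill, dispense at the
corresponding position on each support, wash, refill) can be
summarized as a simple schedule. The following is an
illustrative sketch only: the action names and the function
print_schedule are hypothetical, and it assumes one array
position per reagent and does not model the control unit, the
motor signals, or the solenoid timing.

```python
from typing import Iterator, List


def print_schedule(reagents: List[str], n_supports: int) -> Iterator[tuple]:
    """Yield the ordered actions for depositing each reagent once on every support."""
    for position, reagent in enumerate(reagents):
        if position > 0:
            # Wash the tip before loading the next reagent.
            yield ("wash",)
        # Dip the tip into the reagent well; the channel fills by capillary flow.
        yield ("fill", reagent)
        for support in range(n_supports):
            # Position the dispenser (x/y motors), then tap the tip on the surface.
            yield ("move", support, position)
            yield ("tap", support, position)


for action in print_schedule(["reagent_1", "reagent_2"], n_supports=3):
    print(action)
```

In this ordering a given reagent is deposited at the same array
position on every support before the tip is washed, which keeps
the number of wash and refill cycles equal to the number of
reagents.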



IV. Microarray Substrate

This section describes embodiments of a substrate
having a microarray of biological polymers carried on
the substrate surface. Subsection A describes a
multi-cell substrate, each cell of which contains a
microarray, and preferably an identical microarray, of
distinct biopolymers, such as distinct polynucleotides,
formed on a porous surface. Subsection B describes a
microarray of distinct polynucleotides bound on a glass
slide coated with a polycationic polymer.



A. Multi-Cell Substrate 

Fig. 9 illustrates, in plan view, a substrate 110
constructed according to the invention. The substrate
has an 8 x 12 rectangular array 112 of cells, such as
cells 114, 116, formed on the substrate surface. With
reference to Fig. 10, each cell, such as cell 114, in
turn supports a microarray 118 of distinct biopolymers,
such as polypeptides or polynucleotides, at known,
addressable regions of the microarray. Two such
regions forming the microarray are indicated at 120,
and correspond to regions, such as regions 42, forming
the microarray of distinct biopolymers shown in Fig. 3.

The 96-cell array shown in Fig. 9 typically has
array dimensions between about 12 and 244 mm in width
and 8 and 400 mm in length, with the cells in the array
having width and length dimensions of 1/12 and 1/8 the
array width and length dimensions, respectively, i.e.,
between about 1 and 20 mm in width and 1 and 50 mm in
length.
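
The cell dimensions follow directly from dividing the array
dimensions by the 12 x 8 cell layout. A minimal sketch of that
arithmetic is given below; the function name is illustrative,
and the width of the grid lines between cells is ignored, which
is an assumption not made in the text.

```python
def cell_size_mm(array_width_mm: float, array_length_mm: float,
                 cols: int = 12, rows: int = 8) -> tuple:
    """Nominal cell width and length for a cols x rows array, ignoring grid-line width."""
    return array_width_mm / cols, array_length_mm / rows


print(cell_size_mm(12, 8))     # smallest array dimensions quoted in the text
print(cell_size_mm(244, 400))  # largest array dimensions quoted in the text
```

The smallest array gives cells of about 1 mm x 1 mm and the
largest gives cells of about 20 mm x 50 mm, consistent with the
ranges stated above.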

The construction of the substrate is shown
cross-sectionally in Fig. 11, which is an enlarged
sectional view taken along view line 124 in Fig. 9. The
substrate includes a water-impermeable backing 126,
such as a glass slide or rigid polymer sheet. Formed
on the surface of the backing is a water-permeable film
128. The film is formed of a porous membrane material,
such as nitrocellulose membrane, or a porous web
material, such as a nylon, polypropylene, or PVDF
porous polymer material. The thickness of the film is
preferably between about 10 and 1000 µm. The film may
be applied to the backing by spraying or coating
uncured material on the backing, or by applying a
preformed membrane to the backing. The backing and
film may be obtained as a preformed unit from a
commercial source, e.g., a plastic-backed
nitrocellulose film available from Schleicher and
Schuell Corporation.

With continued reference to Fig. 11, the
film-covered surface in the substrate is partitioned
into a desired array of cells by water-impermeable grid
lines, such as lines 130, 132, which have infiltrated
the film down to the level of the backing, and extend
above the surface of the film as shown, typically a
distance of 100 to 2000 µm above the film surface.

The grid lines are formed on the substrate by
laying down an uncured or otherwise flowable resin or
elastomer solution in an array grid, allowing the
material to infiltrate the porous film down to the
backing, then curing or otherwise hardening the grid
lines to form the cell-array substrate.

One preferred material for the grid is a flowable
silicone available from Loctite Corporation. The
barrier material can be extruded through a narrow
syringe (e.g., 22 gauge) using air pressure or
mechanical pressure. The syringe is moved relative to
the solid support to print the barrier elements as a
grid pattern. The extruded bead of silicone wicks into
the pores of the solid support and cures to form a
shallow waterproof barrier separating the regions of
the solid support.

In alternative embodiments, the barrier element
can be a wax-based material or a thermoset material
such as epoxy. The barrier material can also be a
UV-curing polymer which is exposed to UV light after
being printed onto the solid support. The barrier
material may also be applied to the solid support using
printing techniques such as silk-screen printing. The
barrier material may also be a heat-seal stamping of
the porous solid support which seals its pores and
forms a water-impervious barrier element. The barrier
material may also be a shallow grid which is laminated
or otherwise adhered to the solid support.

In addition to plastic-backed nitrocellulose, the
solid support can be virtually any porous membrane with
or without a non-porous backing. Such membranes are
readily available from numerous vendors and are made
from nylon, PVDF, polysulfone and the like. In an
alternative embodiment, the barrier element may also be
used to adhere the porous membrane to a non-porous
backing in addition to functioning as a barrier to
prevent cross contamination of the assay reagents.

In an alternative embodiment, the solid support
can be of a non-porous material. The barrier can be
printed either before or after the microarray of
biomolecules is printed on the solid support.

As can be appreciated, the cells formed by the
grid lines and the underlying backing are
water-impermeable, having side barriers projecting
above the porous film in the cells. Thus,
defined-volume samples can be placed in each well
without risk of cross-contamination with sample
material in adjacent cells. In Fig. 11, defined-volume
samples, such as sample 134, are shown in the cells.

As noted above, each well contains a microarray of
distinct biopolymers. In one general embodiment, the
microarrays in the well are identical arrays of
distinct biopolymers, e.g., different sequence
polynucleotides. Such arrays can be formed in
accordance with the methods described in Section II, by
depositing a first selected polynucleotide at the same
selected microarray position in each of the cells, then
depositing a second polynucleotide at a different
microarray position in each well, and so on until a
complete, identical microarray is formed in each cell.

In a preferred embodiment, each microarray
contains about 10³ distinct polynucleotide or
polypeptide biopolymers per surface area of less than
about 1 cm². Also in a preferred embodiment, the
biopolymers in each microarray region are present in a
defined amount between about 0.1 femtomoles and 100
nanomoles. The ability to form high-density arrays of
biopolymers, where each region is formed of a
well-defined amount of deposited material, can be
achieved in accordance with the microarray-forming
method described in Section II.

Also in a preferred embodiment, the biopolymers
are polynucleotides having lengths of at least about 50
bp, i.e., substantially longer than oligonucleotides
which can be formed in high-density arrays by schemes
involving parallel, step-wise polymer synthesis on the
array surface.

In the case of a polynucleotide array, in an assay
procedure, a small volume of the labeled DNA probe
mixture in a standard hybridization solution is loaded
onto each cell. The solution will spread to cover the
entire microarray and stop at the barrier elements.
The solid support is then incubated in a humid chamber
at the appropriate temperature as required by the
assay.

Each assay may be conducted in an "open-face"
format where no further sealing step is required, since
the hybridization solution will be kept properly
hydrated by the water vapor in the humid chamber. At
the conclusion of the incubation step, the entire solid
support containing the numerous microarrays is rinsed
quickly enough to dilute the assay reagents so that no
significant cross contamination occurs. The entire
solid support is then reacted with detection reagents
if needed and analyzed using standard colorimetric,
radioactive or fluorescent detection means. All
processing and detection steps are performed
simultaneously on all of the microarrays on the solid
support, ensuring uniform assay conditions for all of
the microarrays on the solid support.

B. Glass-Slide Polynucleotide Array 

Fig. 5 shows a substrate 136 formed according to
another aspect of the invention, and intended for use
in detecting binding of labeled polynucleotides to one
or more of a plurality of distinct polynucleotides. The
substrate includes a glass substrate 138 having formed
on its surface a coating of a polycationic polymer,
preferably a cationic polypeptide, such as polylysine
or polyarginine. Formed on the polycationic coating is
a microarray 140 of distinct polynucleotides, each
localized at known selected array regions, such as
regions 142.

The slide is coated by placing a uniform-thickness
film of a polycationic polymer, e.g., poly-l-lysine, on
the surface of a slide and drying the film to form a
dried coating. The amount of polycationic polymer
added is sufficient to form at least a monolayer of
polymers on the glass surface. The polymer film is
bound to the surface via electrostatic binding between
negative silyl-OH groups on the surface and charged
amine groups in the polymers. Poly-l-lysine coated
glass slides may be obtained commercially, e.g., from
Sigma Chemical Co. (St. Louis, MO).

To form the microarray, defined volumes of
distinct polynucleotides are deposited on the
polymer-coated slide, as described in Section II.
According to an important feature of the substrate, the
deposited polynucleotides remain bound to the coated
slide surface non-covalently when an aqueous DNA sample
is applied to the substrate under conditions which
allow hybridization of reporter-labeled polynucleotides
in the sample to complementary-sequence
(single-stranded) polynucleotides in the substrate
array. The method is illustrated in Examples 1 and 2.

To illustrate this feature, a substrate of the
type just described, but having an array of
same-sequence polynucleotides, was mixed with
fluorescent-labeled complementary DNA under
hybridization conditions. After washing to remove
non-hybridized material, the substrate was examined by
low-power fluorescence microscopy. The array can be
visualized by the relatively uniform labeling pattern
of the array regions.

In a preferred embodiment, each microarray
contains at least 10³ distinct polynucleotide or
polypeptide biopolymers per surface area of less than
about 1 cm². In the embodiment shown in Fig. 5, the
microarray contains 400 regions in an area of about 16
mm², or 2.5 x 10³ regions/cm². Also in a preferred
embodiment, the polynucleotides in each microarray
region are present in a defined amount between about
0.1 femtomoles and 100 nanomoles. As above, the
ability to form high-density arrays of this type, where
each region is formed of a well-defined amount of
deposited material, can be achieved in accordance with
the microarray-forming method described in Section II.
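
The density quoted for the embodiment of Fig. 5 can be verified
from the region count and the array area. The sketch below is
illustrative only; the function name is an assumption and is
not taken from the original text.

```python
def density_per_cm2(n_regions: int, area_mm2: float) -> float:
    """Areal density in regions/cm^2 from a region count and an area in mm^2."""
    return n_regions * 100 / area_mm2  # 1 cm^2 = 100 mm^2


print(density_per_cm2(400, 16))
```

With 400 regions in about 16 mm² this gives 2,500 regions/cm²,
i.e., the 2.5 x 10³ regions/cm² stated above.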

Also in a preferred embodiment, the
polynucleotides have lengths of at least about 50 bp,
i.e., substantially longer than oligonucleotides which
can be formed in high-density arrays by various in situ
synthesis schemes.



V. Utility 

Microarrays of immobilized nucleic acid sequences 
prepared in accordance with the invention can be used 
for large scale hybridization assays in numerous 
genetic applications, including genetic and physical 
mapping of genomes, monitoring of gene expression, DNA 
sequencing, genetic diagnosis, genotyping of organisms, 
and distribution of DNA reagents to researchers. 

For gene mapping, a gene or a cloned DNA fragment
is hybridized to an ordered array of DNA fragments, and
the identity of the DNA elements applied to the array
is unambiguously established by the pixel or pattern of
pixels of the array that are detected. One application
of such arrays for creating a genetic map is described
by Nelson, et al. (1993). In constructing physical
maps of the genome, arrays of immobilized cloned DNA
fragments are hybridized with other cloned DNA
fragments to establish whether the cloned fragments in
the probe mixture overlap and are therefore contiguous
to the immobilized clones on the array. For example,
Lehrach, et al., describe such a process.

The arrays of immobilized DNA fragments may also
be used for genetic diagnostics. To illustrate, an
array containing multiple forms of a mutated gene or
genes can be probed with a labeled mixture of a
patient's DNA which will preferentially interact with
only one of the immobilized versions of the gene.

The detection of this interaction can lead to a
medical diagnosis. Arrays of immobilized DNA fragments
can also be used in DNA probe diagnostics. For
example, the identity of a pathogenic microorganism can
be established unambiguously by hybridizing a sample of
the unknown pathogen's DNA to an array containing many
types of known pathogenic DNA. A similar technique can
also be used for unambiguous genotyping of any
organism. Other molecules of genetic interest, such as
cDNA's and RNA's, can be immobilized on the array or
alternately used as the labeled probe mixture that is
applied to the array.

In one application, an array of cDNA clones
representing genes is hybridized with total cDNA from
an organism to monitor gene expression for research or
diagnostic purposes. Labeling total cDNA from a normal
cell with one color fluorophore and total cDNA from a
diseased cell with another color fluorophore and
simultaneously hybridizing the two cDNA samples to the
same array of cDNA clones allows for differential gene
expression to be measured as the ratio of the two
fluorophore intensities. This two-color experiment can
be used to monitor gene expression in different tissue
types, disease states, response to drugs, or response
to environmental factors. An example of this approach
is illustrated in Example 2, described with respect to
Fig. 8.
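
The ratio measurement described above can be sketched as
follows. This is a minimal illustration, not the analysis used
in the examples: the function name, the clone labels, and the
intensity values are all made up, and the use of a base-2
logarithm and of background-corrected intensities are
assumptions added here for convenience.

```python
import math
from typing import Dict


def log2_expression_ratios(normal: Dict[str, float],
                           diseased: Dict[str, float]) -> Dict[str, float]:
    """Return log2(diseased / normal) for each array element measured in both channels."""
    return {
        spot: math.log2(diseased[spot] / normal[spot])
        for spot in normal
        # Skip elements missing from one channel or with non-positive signal.
        if spot in diseased and normal[spot] > 0 and diseased[spot] > 0
    }


# Made-up intensities for three hypothetical array elements.
normal = {"clone_1": 100.0, "clone_2": 400.0, "clone_3": 250.0}
diseased = {"clone_1": 200.0, "clone_2": 100.0, "clone_3": 250.0}
print(log2_expression_ratios(normal, diseased))
```

A positive value indicates a stronger signal from the diseased
sample at that array element, a negative value a weaker signal,
and zero indicates equal signal in both channels.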

By way of example and without implying a
limitation of scope, such a procedure could be used to
simultaneously screen many patients against all known
mutations in a disease gene. This invention could be
used in the form of, for example, 96 identical 0.9 cm x
2.2 cm microarrays fabricated on a single 12 cm x 18 cm
sheet of plastic-backed nitrocellulose where each
microarray could contain, for example, 100 DNA
fragments representing all known mutations of a given
gene. The region of interest from each of the DNA
samples from 96 patients could be amplified, labeled,
and hybridized to the 96 individual arrays with each
assay performed in 100 microliters of hybridization
solution. The approximately 1 mm thick silicone rubber
barrier elements between individual arrays prevent
cross contamination of the patient samples by sealing
the pores of the nitrocellulose and by acting as a
physical barrier between each microarray. The solid
support containing all 96 microarrays assayed with the
96 patient samples is incubated, rinsed, detected and
analyzed as a single sheet of material using standard
radioactive, fluorescent, or colorimetric detection
means (Maniatas, et al., 1989). Previously, such a
procedure would involve the handling, processing and
tracking of 96 separate membranes in 96 separate sealed
chambers. By processing all 96 arrays as a single
sheet of material, significant time and cost savings
are possible.
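
The dimensions given in this example can be checked for
consistency with a short calculation. The sketch below assumes
the 96 microarrays are laid out as 12 columns across the 12 cm
side and 8 rows along the 18 cm side; that orientation, and the
function name, are assumptions rather than statements from the
text.

```python
def grid_margins_cm(sheet_cm: tuple, cell_cm: tuple, cols: int, rows: int) -> tuple:
    """Return (fits, spare_width, spare_length) for a cols x rows grid of cells on a sheet."""
    spare_w = sheet_cm[0] - cols * cell_cm[0]
    spare_l = sheet_cm[1] - rows * cell_cm[1]
    return spare_w >= 0 and spare_l >= 0, spare_w, spare_l


fits, spare_w, spare_l = grid_margins_cm((12.0, 18.0), (0.9, 2.2), cols=12, rows=8)
total_hyb_volume_ml = 96 * 100 / 1000  # 96 assays at 100 microliters each
print(fits, round(spare_w, 2), round(spare_l, 2), total_hyb_volume_ml)
```

Under this layout the microarrays occupy 10.8 cm x 17.6 cm,
leaving about 1.2 cm and 0.4 cm in total for the barrier
elements and sheet margins, and the 96 assays together use
about 9.6 ml of hybridization solution.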

The assay format can be reversed where the patient
or organism's DNA is immobilized as the array elements
and each array is hybridized with a different mutated
allele or genetic marker. The gridded solid support
can also be used for parallel non-DNA ELISA assays.
Furthermore, the invention allows for the use of all
standard detection methods without the need to remove
the shallow barrier elements to carry out the detection
step.

In addition to the genetic applications listed
above, arrays of whole cells, peptides, enzymes,
antibodies, antigens, receptors, ligands,
phospholipids, polymers, drug congener preparations or
chemical substances can be fabricated by the means
described in this invention for large scale screening
assays in medical diagnostics, drug discovery,
molecular biology, immunology and toxicology.

The multi-cell substrate aspect of the invention
allows for the rapid and convenient screening of many
DNA probes against many ordered arrays of DNA
fragments. This eliminates the need to handle and
detect many individual arrays for performing mass
screenings for genetic research and diagnostic
applications. Numerous microarrays can be fabricated
on the same solid support and each microarray reacted
with a different DNA probe while the solid support is
processed as a single sheet of material.


The following examples illustrate, but are in no way
intended to limit, the present invention.

Example 1 

Genomic-Complexity Hybridization to Micro
DNA Arrays Representing the Yeast
Saccharomyces cerevisiae Genome with
Two-Color Fluorescent Detection

The array elements were randomly amplified PCR
(Bohlander, et al., 1992) products using physically
mapped lambda clones of S. cerevisiae genomic DNA
templates (Riles, et al., 1993). The PCR was performed
directly on the lambda phage lysates resulting in an
amplification of both the 35 kb lambda vector and the
5-15 kb yeast insert sequences in the form of a uniform
distribution of PCR product between 250 and 1500 base
pairs in length. The PCR product was purified using
Sephadex G50 gel filtration (Pharmacia, Piscataway, NJ)
and concentrated by evaporation to dryness at room
temperature overnight. Each of the 864 amplified
lambda clones was rehydrated in 15 µl of 3 x SSC in
preparation for spotting onto the glass.

The micro arrays were fabricated on microscope
slides which were coated with a layer of poly-l-lysine
(Sigma). The automated apparatus described in Section
IV loaded 1 µl of the concentrated lambda clone PCR
product in 3 x SSC directly from 96 well storage plates
into the open capillary printing element and deposited
~5 nl of sample per slide at 380 micron spacing between
spots, on each of 40 slides. The process was repeated
for all 864 samples and 8 control spots. After the
spotting operation was complete, the slides were
rehydrated in a humid chamber for 2 hours, baked in a
dry 80° C vacuum oven for 2 hours, rinsed to remove
unabsorbed DNA and then treated with succinic anhydride
to reduce non-specific adsorption of the labeled
hybridization probe to the poly-l-lysine coated glass
surface. Immediately prior to use, the immobilized DNA
on the array was denatured in distilled water at 90° C
for 2 minutes.
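
The counts and volumes quoted for the spotting run can be checked with simple arithmetic. The sketch below is illustrative only: the constant and function names are invented here, and the 5 nl deposit is treated as the approximate figure given in the text.

```python
# Figures quoted in the spotting description; names are hypothetical.
LOADED_VOLUME_NL = 1000   # 1 microliter loaded into the capillary per sample
DEPOSIT_VOLUME_NL = 5     # roughly 5 nl deposited per slide (approximate)
SLIDES_PER_RUN = 40
SAMPLES = 864             # amplified lambda clones
CONTROL_SPOTS = 8


def print_run_summary():
    """Return basic totals for one print run (illustrative arithmetic only)."""
    spots_per_slide = SAMPLES + CONTROL_SPOTS
    dispensed_per_load_nl = DEPOSIT_VOLUME_NL * SLIDES_PER_RUN
    return {
        "spots_per_slide": spots_per_slide,
        "total_deposits": spots_per_slide * SLIDES_PER_RUN,
        "dispensed_per_load_nl": dispensed_per_load_nl,
        "fraction_of_load_used": dispensed_per_load_nl / LOADED_VOLUME_NL,
    }


if __name__ == "__main__":
    for key, value in print_run_summary().items():
        print(f"{key}: {value}")
```

On these figures, each 1 µl load holds about five times the volume actually dispensed across the 40 slides.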

For the pooled chromosome experiment, the 16
chromosomes of Saccharomyces cerevisiae were separated
in a CHEF agarose gel apparatus (Biorad, Richmond, CA).
The six largest chromosomes were isolated in one gel
slice and the smallest 10 chromosomes in a second gel
slice. The DNA was recovered using a gel extraction
kit (Qiagen, Chatsworth, CA). The two chromosome pools
were randomly amplified in a manner similar to that
used for the target lambda clones. Following
amplification, 5 micrograms of each of the amplified
chromosome pools were separately random-primer labeled
using Klenow polymerase (Amersham, Arlington Heights,
IL) with a lissamine conjugated nucleotide analog
(Dupont NEN, Boston, MA) for the pool containing the
six largest chromosomes, and with a fluorescein
conjugated nucleotide analog (BMB) for the pool
containing the smallest ten chromosomes. The two pools
were mixed and concentrated using an ultrafiltration
device (Amicon, Danvers, MA).

Five micrograms of the hybridization probe
consisting of both chromosome pools in 7.5 µl of TE was
denatured in a boiling water bath and then snap cooled
on ice. 2.5 µl of concentrated hybridization solution
(5 x SSC and 0.1% SDS) was added and all 10 µl
transferred to the array surface, covered with a cover
slip, placed in a custom-built single-slide humidity
chamber and incubated at 60° C for 12 hours. The slides
were then rinsed at room temperature in 0.1 x SSC and
0.1% SDS for 5 minutes, cover slipped and scanned.

A custom built laser fluorescent scanner was used
to detect the two-color hybridization signals from the
1.8 x 1.8 cm array at 20 micron resolution. The
scanned image was gridded and analyzed using custom
image analysis software. After correcting for optical
crosstalk between the fluorophores due to their
overlapping emission spectra, the red and green
hybridization values for each clone on the array were
correlated to the known physical map position of the
clone resulting in a computer-generated color karyotype
of the yeast genome.
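
The text does not say how the crosstalk correction was carried out. One common approach is linear unmixing, sketched below on the assumption that a fixed fraction of each fluorophore's signal leaks into the other channel. The function names and leak fractions are hypothetical and are not taken from the custom software mentioned above.

```python
def unmix_two_color(obs_red, obs_green, green_into_red, red_into_green):
    """Recover per-fluorophore signals from two crosstalk-contaminated channels.

    Assumed model (not from the source):
        obs_red   = red   + green_into_red * green
        obs_green = green + red_into_green * red
    """
    det = 1.0 - green_into_red * red_into_green
    if det <= 0:
        raise ValueError("leak fractions too large to separate the channels")
    red = (obs_red - green_into_red * obs_green) / det
    green = (obs_green - red_into_green * obs_red) / det
    return red, green


def pixels_per_side(array_side_cm, resolution_um):
    """Number of scan pixels along one side of a square array."""
    return round(array_side_cm * 10_000 / resolution_um)


if __name__ == "__main__":
    # Hypothetical spot: true red 100, true green 50, 10% and 5% leakage.
    print(unmix_two_color(105.0, 55.0, 0.10, 0.05))
    # Scan size implied by a 1.8 cm array at 20 micron resolution.
    print(pixels_per_side(1.8, 20))
```

The second helper only restates the scan geometry given in the text: a 1.8 cm side sampled every 20 microns.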

Figure 6 shows the hybridization pattern of the
two chromosome pools. A red signal indicates that the
lambda clone on the array surface contains a cloned
genomic DNA segment from one of the largest six yeast
chromosomes. A green signal indicates that the lambda
clone insert comes from one of the smallest ten yeast
chromosomes. Orange signals indicate repetitive
sequences which cross hybridized to both chromosome
pools. Control spots on the array confirm that the
hybridization is specific and reproducible.
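
The red, green and orange reading of the spots can be expressed as a simple rule on the two corrected channel intensities. The following is a minimal sketch, assuming hypothetical thresholds; the source gives no numeric criteria for calling a spot.

```python
def classify_spot(red, green, min_signal=100.0, ratio_cutoff=3.0):
    """Label a spot by which chromosome pool dominated its signal.

    min_signal and ratio_cutoff are hypothetical thresholds chosen for
    illustration only.
    """
    if red < min_signal and green < min_signal:
        return "dark"     # no usable hybridization signal
    if red >= ratio_cutoff * green:
        return "red"      # clone from one of the six largest chromosomes
    if green >= ratio_cutoff * red:
        return "green"    # clone from one of the ten smallest chromosomes
    return "orange"       # both pools hybridized, e.g. repetitive sequence


if __name__ == "__main__":
    for pair in [(1000, 50), (40, 900), (500, 400), (10, 20)]:
        print(pair, classify_spot(*pair))
```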

The physical map locations of the genomic DNA
fragments contained in each of the clones used as array
elements have been previously determined by Olson and
co-workers (Riles, et al.) allowing for the automatic
generation of the color karyotype shown in Figure 1.
The color of a chromosomal section on the karyotype
corresponds to the color of the array element
containing the clone from that section. The black
regions of the karyotype represent false negative dark
spots on the array (10%) or regions of the genome not
covered by the Olson clone library (90%). Note that
the largest six chromosomes are mainly red while the
smallest ten chromosomes are mainly green matching the
original CHEF gel isolation of the hybridization probe.
Areas of the red chromosomes containing green spots and
vice-versa are probably due to spurious sample tracking
errors in the formation of the original library and in
the amplification and spotting procedures.

The yeast genome arrays have also been probed with
individual clones or pools of clones that are
fluorescently labeled for physical mapping purposes.
The hybridization signals of these clones to the array
were translated into a position on the physical map of
yeast.

Example 2 

Total cDNA Hybridized to Micro Arrays of
cDNA Clones with Two-color
Fluorescent Detection

24 clones containing cDNA inserts from the plant
Arabidopsis were amplified using PCR. Salt was added
to the purified PCR products to a final concentration
of 3 x SSC. The cDNA clones were spotted on
poly-l-lysine coated microscope slides in a manner
similar to Example 1. Among the cDNA clones was a clone
representing a transcription factor HAT4, which had
previously been used to create a transgenic line of the
plant Arabidopsis, in which this gene is present at ten
times the level found in wild-type Arabidopsis (Schena,
et al., 1992).

Total poly-A mRNA from wild type Arabidopsis was
isolated using standard methods (Maniatis, et al.,
1989) and reverse transcribed into total cDNA, using
fluorescein nucleotide analog to label the cDNA product
(green fluorescence). A similar procedure was
performed with the transgenic line of Arabidopsis where
the transcription factor HAT4 was inserted into the
genome using standard gene transfer protocols. cDNA
copies of mRNA from the transgenic plant were labeled
with a lissamine nucleotide analog (red fluorescence).
Two micrograms of the cDNA products from each type of
plant were pooled together and hybridized to the cDNA
clone array in a 10 microliter hybridization reaction
in a manner similar to Example 1. Rinsing and
detection of hybridization was also performed in a
manner similar to Example 1. Fig. 8 shows the resulting
hybridization pattern of the array.

Genes equally expressed in wild type and the
transgenic Arabidopsis appeared yellow due to equal
contributions of the green and red fluorescence to the
final signal. The dots are different intensities of
yellow indicating various levels of gene expression.
The cDNA clone representing the transcription factor
HAT4, expressed in the transgenic line of Arabidopsis
but not detectably expressed in wild type Arabidopsis,
appears as a red dot (with the arrow pointing to it),
indicating the preferential expression of the
transcription factor in the red-labeled transgenic
Arabidopsis and the relative lack of expression of the
transcription factor in the green-labeled wild type
Arabidopsis.
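
A two-color readout of this kind is often summarized as a log ratio of the two channel intensities per clone. The sketch below is one way to rank clones by differential expression. The intensity values, the clone names other than HAT4, and the background floor are all hypothetical and are not data from this example.

```python
import math


def log2_ratio(red, green, floor=1.0):
    """log2 of transgenic (red) over wild-type (green) signal for one spot.

    `floor` is a hypothetical background floor that keeps the ratio finite
    when one channel shows no detectable signal.
    """
    return math.log2(max(red, floor) / max(green, floor))


def rank_by_differential(spots):
    """Sort clone names by |log2 ratio|, most differentially expressed first."""
    return sorted(
        spots,
        key=lambda name: abs(log2_ratio(*spots[name])),
        reverse=True,
    )


# Hypothetical (red, green) intensities, for illustration only.
example_spots = {
    "HAT4": (900.0, 0.0),       # red only: signal seen in the transgenic line
    "clone_A": (500.0, 480.0),  # near-equal signals, would appear yellow
    "clone_B": (120.0, 300.0),  # somewhat stronger in wild type
}

if __name__ == "__main__":
    for name in rank_by_differential(example_spots):
        print(name, round(log2_ratio(*example_spots[name]), 2))
```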

An advantage of the microarray hybridization 
format for gene expression studies is the high partial 
concentration of each cDNA species achievable in the 10 
microliter hybridization reaction. This high partial 
concentration allows for detection of rare transcripts 
without the need for PCR amplification of the 
hybridization probe which may bias the true genetic 
representation of each discrete cDNA species. 
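
The concentration argument can be made concrete with a rough calculation. In the sketch below only the 10 microliter reaction volume and the 2 micrograms of cDNA per plant come from the text. The rare-species mass fraction and the larger comparison volume are assumptions chosen for illustration.

```python
def species_concentration_ng_per_ul(total_cdna_ug, volume_ul, mass_fraction):
    """Concentration of one cDNA species, given its share of the total mass."""
    total_ng = total_cdna_ug * 1000.0
    return total_ng * mass_fraction / volume_ul


def fold_gain(small_volume_ul, large_volume_ul):
    """How much more concentrated the same probe is in the smaller volume."""
    return large_volume_ul / small_volume_ul


TOTAL_CDNA_UG = 4.0          # 2 micrograms from each plant, pooled (from the text)
MICROARRAY_VOLUME_UL = 10.0  # hybridization volume (from the text)
RARE_FRACTION = 1e-5         # hypothetical: 1 part in 100,000 by mass
LARGE_VOLUME_UL = 10_000.0   # hypothetical 10 ml hybridization, for comparison

if __name__ == "__main__":
    print(species_concentration_ng_per_ul(
        TOTAL_CDNA_UG, MICROARRAY_VOLUME_UL, RARE_FRACTION))
    print(fold_gain(MICROARRAY_VOLUME_UL, LARGE_VOLUME_UL))
```

Under these assumptions the rare species is a thousand times more concentrated in the 10 microliter reaction than it would be in the larger volume, with no amplification of the probe.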

Gene expression studies such as these can be used 
for genomics research to discover which genes are 
expressed in which cell types, disease states, 
development states or environmental conditions. Gene 
expression studies can also be used for diagnosis of 
disease by empirically correlating gene expression 
patterns to disease states. 

Example 3 

Multiplexed Colorimetric Hybridization on
a Gridded Solid Support

A sheet of plastic-backed nitrocellulose was
gridded with barrier elements made from silicone rubber
according to the description in Section IV-A. The
sheet was soaked in 10 x SSC and allowed to dry. As
shown in Fig. 12, 192 M13 clones each with a different
yeast insert were arrayed 400 microns apart in four
quadrants of the solid support using the automated
device described in Section III. The bottom left
quadrant served as a negative control for hybridization
while each of the other three quadrants was hybridized
simultaneously with a different oligonucleotide using
the open-face hybridization technology described in
Section IV-A. The first two and last four elements of
each array are positive controls for the colorimetric
detection step.

The oligonucleotides were labeled with fluorescein
which was detected using an anti-fluorescein antibody
conjugated to alkaline phosphatase that precipitated an
NBT/BCIP dye on the solid support (Amersham). Perfect
matches between the labeled oligos and the M13 clones
resulted in dark spots visible to the naked eye and
detected using an optical scanner (HP ScanJet II)
attached to a personal computer. The hybridization
patterns are different in every quadrant indicating
that each oligo found several unique M13 clones from
among the 192 with a perfect sequence match. Note that
the open capillary printing tip leaves detectable
dimples on the nitrocellulose which can be used to
automatically align and analyze the images.

Although the invention has been described with
respect to specific embodiments and methods, it will be
clear that various changes and modifications may be made
without departing from the invention.

IT IS CLAIMED:

1. A method of forming a microarray of
analyte-assay regions on a solid support, where each
region in the array has a known amount of a selected,
analyte-specific reagent, said method comprising,

(a) loading a solution of a selected
analyte-specific reagent in a reagent-dispensing device
having an elongate capillary channel (i) formed by
spaced-apart, coextensive elongate members, (ii) adapted
to hold a quantity of the reagent solution and (iii)
having a tip region at which aqueous solution in the
channel forms a meniscus,

(b) tapping the tip of the dispensing device
against a solid support at a defined position on the
surface, with an impulse effective to break the
meniscus in the capillary channel and deposit a
selected volume of solution on the surface, and

(c) repeating steps (a) and (b) until said array
is formed.

2. The method of claim 1, wherein said tapping is
carried out with an impulse effective to deposit a
selected volume in the volume range between 0.01 to 100
nl.

3. The method of claim 1, wherein said channel is
formed by a pair of spaced-apart tapered elements.

4. The method of claim 1, for forming a plurality
of such arrays, wherein step (b) is applied to a
selected position on each of a plurality of solid
supports at each repeat cycle proceeding step (c).

5. The method of claim 1, which further includes,
after performing steps (a) and (b) at least one time,
reloading the reagent-dispensing device with a new
reagent solution by the steps of (i) dipping the
capillary channel of the device in a wash solution,
(ii) removing wash solution drawn into the capillary
channel, and (iii) dipping the capillary channel into
the new reagent solution.

6. Automated apparatus for forming a microarray
of analyte-assay regions on a plurality of solid
supports, where each region in the array has a known
amount of a selected, analyte-specific reagent, said
apparatus comprising

(a) a holder for holding, at known positions, a
plurality of planar supports,

(b) a reagent dispensing device having an open
capillary channel (i) formed by spaced-apart,
coextensive elongate members (ii) adapted to hold a
quantity of the reagent solution and (iii) having a tip
region at which aqueous solution in the channel forms a
meniscus,

(c) positioning means for positioning the
dispensing device at a selected array position with
respect to a support in said holder,

(d) dispensing means for moving the device into
tapping engagement against a support with a selected
impulse, when the device is positioned at a defined
array position with respect to that support, with an
impulse effective to break the meniscus of liquid in
the capillary channel and deposit a selected volume of
solution on the surface, and

(e) control means for controlling said positioning
and dispensing means.




7. The apparatus of claim 6, wherein said
dispensing means is effective to move said dispensing
device against a support with an impulse effective to
deposit a selected volume in the volume range between
0.01 to 100 nl.

8. The apparatus of claim 6, wherein said channel 
is formed by a pair of spaced-apart tapered elements. 

9. The apparatus of claim 6, wherein the control 
means operates to (i) place the dispensing device at a 
loading station, (ii) move the capillary channel in the 
device into a selected reagent at the loading station, 
to load the dispensing device with the reagent, and 
(iii) dispense the reagent at a defined array position 
on each of the supports on said holder. 

10. The apparatus of claim 6, wherein the control
device further operates, at the end of a dispensing
cycle, to wash the dispensing device by (i) placing the
dispensing device at a washing station, (ii) moving the
capillary channel in the device into a wash fluid, to
load the dispensing device with the fluid, and (iii)
remove the wash fluid prior to loading the dispensing
device with a fresh selected reagent.

11. The apparatus of claim 6, wherein said device
is one of a plurality of such devices which are carried
on the arm for dispensing different analyte assay
reagents at selected spaced array positions.

12. A substrate with a surface having a
microarray of at least 10³ distinct polynucleotide or
polypeptide biopolymers per 1 cm² surface area, each
distinct biopolymer sample (i) being disposed at a
separate, defined position in said array, (ii) having a
length of at least 50 subunits, and (iii) being present
in a defined amount between about 0.1 femtomole and 100
nanomoles.

13. The substrate of claim 12, wherein said
surface is a glass slide coated with polylysine, and said
biopolymers are polynucleotides.

14. The substrate of claim 12, wherein said
substrate has a water-impermeable backing, a
water-permeable film formed on the backing, and a grid
formed on the film, where said grid (i) is composed of
intersecting water-impervious grid elements extending
from said backing to positions raised above the surface
of said film, and (ii) partitions the film into a
plurality of water-impervious cells, where each cell
contains such a biopolymer array.

15. A substrate with a surface array of
sample-receiving cells, comprising

a water-impermeable backing,

a water-permeable film formed on the backing, and

a grid formed on the film, said grid being composed of
intersecting water-impervious grid elements extending
from said backing to positions raised above the surface
of said film.

16. The substrate of claim 15, wherein the cells 
of the array each contain an array of biopolymers. 

17. A substrate for use in detecting binding of
labeled biopolymers to one or more of a plurality
distinct polynucleotides, comprising

a non-porous, glass substrate,

a coating of a cationic polymer on said substrate,
and

an array of distinct polynucleotides to said
coating, where each biopolymer is disposed at a
separate, defined position in a surface array of
biopolymers.

18. A method of detecting differential expression
of each of a plurality of genes in a first cell type
with respect to expression of the same genes in a
second cell type, said method comprising

producing fluorescence-labeled cDNA's from mRNA's
isolated from the two cell types, where the cDNA's
from the first and second cells are labeled with first
and second different fluorescent reporters,

adding a mixture of the labeled cDNA's from the
two cell types to an array of polynucleotides
representing a plurality of known genes derived from
the two cell types, under conditions that result in
hybridization of the cDNA's to complementary-sequence
polynucleotides in the array; and

examining the array by fluorescence under
fluorescence excitation conditions in which (i)
polynucleotides in the array that are hybridized
predominantly to cDNA's derived from one of the first
and second cell types give a distinct first or second
fluorescence emission color, respectively, and (ii)
polynucleotides in the array that are hybridized to
substantially equal numbers of cDNA's derived from the
first and second cell types give a distinct combined
fluorescence emission color, respectively,

wherein the relative expression of known genes in
the two cell types can be determined by the observed
fluorescence emission color of each spot.

19. The method of claim 18, wherein the array of
polynucleotides is formed on a substrate with a surface
having an array of at least 10³ distinct polynucleotide
or polypeptide biopolymers in a surface area of less
than about 1 cm², each distinct biopolymer (i) being
disposed at a separate, defined position in said array,
(ii) having a length of at least 50 subunits, and (iii)
being present in a defined amount between about 0.1
femtomole and 100 nmoles.

20. The method of claim 19, wherein said surface
is a glass slide coated with polylysine, and said
biopolymers are polynucleotides non-covalently bound to
said polylysine.

METHODS FOR FABRICATING
MICROARRAYS OF BIOLOGICAL SAMPLES

CROSS-REFERENCE TO RELATED
APPLICATION

This application is a continuation-in-part of U.S. patent
application Ser. No. 08/261,388, filed Jun. 17, 1994, and
now abandoned.

The United States government may have certain rights in
the present invention pursuant to Grant No. HG00450
awarded by the National Institutes of Health.

FIELD OF THE INVENTION

This invention relates to a method and apparatus for
fabricating microarrays of biological samples for large scale
screening assays, such as arrays of DNA samples to be used
in DNA hybridization assays for genetic research and
diagnostic applications.

REFERENCES

Abouzied, et al., Journal of AOAC International
77(2):495-500 (1994).

Bohlander, et al., Genomics 13:1322-1324 (1992).

Drmanac, et al., Science 260:1649-1652 (1993).

Fodor, et al., Science 251:767-773 (1991).

Khrapko, et al., DNA Sequence 1:375-388 (1991).

Kuriyama, et al., AN ISFET BIOSENSOR, APPLIED
BIOSENSORS (Donald Wise, Ed.), Butterworths, pp.
93-114 (1989).

Lehrach, et al., HYBRIDIZATION FINGERPRINTING IN
GENOME MAPPING AND SEQUENCING, GENOME
ANALYSIS, VOL 1 (Davies and Tilghman, Eds.), Cold Spring
Harbor Press, pp. 39-41 (1990).

Maniatis, et al., MOLECULAR CLONING, A LABORATORY
MANUAL, Cold Spring Harbor Press (1989).

Nelson, et al., Nature Genetics 4:11-18 (1993).

Pirrung, et al., U.S. Pat. No. 5,143,854 (1992).

Riles, et al., Genetics 134:81-150 (1993).

Schena, M. et al., Proc. Nat. Acad. Sci. USA
89:3894-3898 (1992).

Southern, et al., Genomics 13:1008-1017 (1992).

BACKGROUND OF THE INVENTION

A variety of methods are currently available for making
arrays of biological macromolecules, such as arrays of
nucleic acid molecules or proteins. One method for making
ordered arrays of DNA on a porous membrane is a "dot blot"
approach. In this method, a vacuum manifold transfers a
plurality, e.g., 96, aqueous samples of DNA from 3 millimeter
diameter wells to a porous membrane. A common
variant of this procedure is a "slot-blot" method in which the
wells have highly-elongated oval shapes.

The DNA is immobilized on the porous membrane by
baking the membrane or exposing it to UV radiation. This is
a manual procedure practical for making one array at a time
and usually limited to 96 samples per array. "Dot-blot"
procedures are therefore inadequate for applications in
which many thousand samples must be determined.

A more efficient technique employed for making ordered
arrays of genomic fragments uses an array of pins dipped
into the wells, e.g., the 96 wells of a microtitre plate, for
transferring an array of samples to a substrate, such as a
porous membrane. One array includes pins that are designed
to spot a membrane in a staggered fashion, for creating an
array of 9216 spots in a 22x22 cm area (Lehrach, et al.,
1990). A limitation with this approach is that the volume of
DNA spotted in each pixel of each array is highly variable.
In addition, the number of arrays that can be made with each
dipping is usually quite small.

An alternate method of creating ordered arrays of nucleic
acid sequences is described by Pirrung, et al. (1992), and
also by Fodor, et al. (1991). The method involves synthesizing
different nucleic acid sequences at different discrete
regions of a support. This method employs elaborate synthetic
schemes, and is generally limited to relatively short
nucleic acid samples, e.g., less than 20 bases. A related
method has been described by Southern, et al. (1992).

Khrapko, et al. (1991) describes a method of making an
oligonucleotide matrix by spotting DNA onto a thin layer of
polyacrylamide. The spotting is done manually with a
micropipette.

None of the methods or devices described in the prior art
are designed for mass fabrication of microarrays characterized
by (i) a large number of micro-sized assay regions
separated by a distance of 50-200 microns or less, and (ii)
a well-defined amount, typically in the picomole range, of
[illegible] each region of the array.

Furthermore, current technology is directed at performing
[illegible] performing DNA hybridizations to arrays spotted
onto porous membrane involves sealing the membrane in a
plastic bag (Maniatis, et al., 1989) or a rotating glass
cylinder (Robbins Scientific) with the labeled hybridization
probe inside the sealed chamber. For arrays made on
non-porous surfaces, such as a microscope slide, each array is
incubated with the labeled hybridization probe sealed under a
coverslip. These techniques require a separate sealed chamber
for each array which makes the screening and handling of many
such arrays inconvenient and time intensive.

Abouzied, et al. (1994) describes a method of printing
horizontal lines of antibodies on a nitrocellulose membrane
and separating regions of the membrane with vertical stripes
of a hydrophobic material. Each vertical stripe is then
reacted with a different antigen and the reaction between the
immobilized antibody and an antigen is detected using a
standard ELISA colorimetric technique. Abouzied's technique
makes it possible to screen many one-dimensional
arrays simultaneously on a single sheet of nitrocellulose.
Abouzied makes the nitrocellulose somewhat hydrophobic
using a line drawn with PAP Pen (Research Products
International). However, Abouzied does not describe a
technology that is capable of completely sealing the pores of
the nitrocellulose. The pores of the nitrocellulose are still
physically open and so the assay reagents can leak through the
hydrophobic barrier during extended high temperature
incubations or in the presence of detergents, which makes the
Abouzied technique unacceptable for DNA hybridization
assays.

Porous membranes with printed patterns of hydrophilic/
hydrophobic regions exist for applications such as ordered
arrays of bacteria colonies. QA Life Sciences (San Diego
Calif.) makes such a membrane with a grid pattern printed
on it. However, this membrane has the same disadvantage as
the Abouzied technique since reagents can still flow between
the gridded arrays making them unusable for separate DNA
hybridization assays.

Pall Corporation make a 96-well plate with a porous filter
heat sealed to the bottom of the plate. These plates are
capable of containing different reagents in each well without
cross-contamination. However, each well is intended to hold
only one target element whereas the invention described
here makes a microarray of many biomolecules in each
subdivided region of the solid support. Furthermore, the 96
[illegible]
device for many colorimetric, fluorescent and radioactive
detection formats which require that the membrane lie flat
against the detection surface. The invention described here
requires no further processing after the assay step since the
barrier elements are shallow and do not interfere with the
detection step, thereby greatly increasing convenience.

Hyseq Corporation has described a method of making an
"array of arrays" on a non-porous solid support for use with
their sequencing by hybridization technique. The method
described by Hyseq involves modifying the chemistry of the
solid support material to form a hydrophobic grid pattern
where each subdivided region contains a microarray of
biomolecules. Hyseq's flat hydrophobic pattern does not
make use of physical blocking as an additional means of
preventing cross contamination.

SUMMARY OF THE INVENTION

The invention includes, in one aspect, a method of forming
a microarray of analyte-assay regions on a solid support,
where each region in the array has a known amount of a
selected, analyte-specific reagent. The method involves first
loading a solution of a selected analyte-specific reagent in a
reagent-dispensing device having an elongate capillary
channel (i) formed by spaced-apart, coextensive elongate
members, (ii) adapted to hold a quantity of the reagent
solution and (iii) having a tip region at which aqueous
solution in the channel forms a meniscus. The channel is
preferably formed by a pair of spaced-apart tapered elements.

The tip of the dispensing device is tapped against a solid
support at a defined position on the support surface with an
impulse effective to break the meniscus in the capillary
channel, and deposit a selected volume of solution on the
surface. The two steps are repeated until the desired array is
formed.

The method may be practiced in forming a plurality of
such arrays, where the solution-depositing step is applied to
a selected position on each of a plurality of solid supports at
each repeat cycle.

The dispensing device may be loaded with a new solution,
by the steps of (i) dipping the capillary channel of the device
in a wash solution, (ii) removing wash solution drawn into
the capillary channel, and (iii) dipping the capillary channel
into the new reagent solution.

Also included in the invention is an automated apparatus
for forming a microarray of analyte-assay regions on a
plurality of solid supports, where each region in the array
has a known amount of a selected, analyte-specific reagent.
The apparatus has a holder for holding, at known positions,
a plurality of planar supports, and a reagent dispensing
device of the type described above.

The apparatus further includes a positioning structure for
positioning the dispensing device at a selected array position
with respect to a support in said holder, and a dispensing
structure for moving the dispensing device into tapping
engagement against a support with a selected impulse effective
to deposit a selected volume on the support, e.g., a
selected volume in the volume range 0.01 to 100 nl.

The positioning and dispensing structures are controlled
by a control unit in the apparatus. The unit operates to (i)
place the dispensing device at a loading station, (ii) move the
capillary channel in the device into a selected reagent at the
loading station, to load the dispensing device with the
reagent, and (iii) dispense the reagent at a defined array
position on each of the supports on said holder. The unit may
further operate, at the end of a dispensing cycle, to wash the
dispensing device by (i) placing the dispensing device at a
washing station, (ii) moving the capillary channel in the
device into a wash fluid, to load the dispensing device with
the fluid, and (iii) removing the wash fluid prior to loading
the dispensing device with a fresh selected reagent.

The dispensing device in the apparatus may be one of a
plurality of such devices which are carried on the arm for
dispensing different analyte assay reagents at selected
spaced array positions.

In another aspect, the invention includes a substrate with
a surface having a microarray of at least 10³ distinct
polynucleotide or polypeptide biopolymers in a surface area of
less than about 1 cm². Each distinct biopolymer (i) is
disposed at a separate, defined position in said array, (ii) has
a length of at least 50 subunits, and (iii) is present in a
defined amount between about 0.1 femtomoles and 100
nanomoles.

In one embodiment, the surface is a glass slide surface
coated with a polycationic polymer, such as polylysine, and
the biopolymers are polynucleotides. In another
embodiment, the substrate has a water-impermeable
backing, a water-permeable film formed on the backing, and
a grid formed on the film. The grid is composed of intersecting
water-impervious grid elements extending from said
backing to positions raised above the surface of said film, and
partitions the film into a plurality of water-impervious
cells. A biopolymer array is formed within each cell.

More generally, there is provided a substrate for use in
detecting binding of labeled polynucleotides to one or more
of a plurality of different-sequence, immobilized
polynucleotides. The substrate includes, in one aspect, a glass
support, a coating of a polycationic polymer, such as
polylysine, on said support, and an array of distinct
polynucleotides electrostatically bound non-covalently to said
coating, where each distinct biopolymer is disposed at a
separate, defined position in a surface array of
polynucleotides.

In another aspect, the substrate includes a water-impermeable
backing, a water-permeable film formed on the
backing, and a grid formed on the film, where the grid is
composed of intersecting water-impervious grid elements
extending from the backing to positions raised above the
surface of the film, forming a plurality of cells. A biopolymer
array is formed within each cell.

Also forming part of the invention is a method of detecting
differential expression of each of a plurality of genes in
a first cell type, with respect to expression of the same genes
in a second cell type. In practicing the method, there is first
produced fluorescent-labeled cDNAs from mRNAs isolated
from the two cells types, where the cDNAs from the first and
second cell types are labeled with first and second different
fluorescent reporters.

A mixture of the labeled cDNAs from the two cell types
is added to an array of polynucleotides representing a
plurality of known genes derived from the two cell types,
under conditions that result in hybridization of the cDNAs to
complementary-sequence polynucleotides in the array. The
array is then examined by fluorescence under fluorescence
excitation conditions in which (i) polynucleotides in the
array that are hybridized predominantly to cDNAs derived
from one of the first or second cell types give a distinct first or second fluorescence emission color, respectively, and (ii) polynucleotides in the array that are hybridized to substantially equal numbers of cDNAs derived from the first and second cell types give a distinct combined fluorescence emission color, respectively. The relative expression of known genes in the two cell types can then be determined by the observed fluorescence emission color of each spot.

These and other objects and features of the invention will become more fully apparent when the following detailed description of the invention is read in conjunction with the accompanying figures.

The file of this patent contains at least one drawing executed in color. Copies of this patent with color drawing(s) will be provided by the Patent and Trademark Office upon request and payment of the necessary fee.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a side view of a reagent-dispensing device having an open-capillary dispensing head constructed for use in one embodiment of the invention;

FIGS. 2A-2C illustrate steps in the delivery of a fixed-volume bead on a hydrophobic surface employing the dispensing head from FIG. 1, in accordance with one embodiment of the method of the invention;

FIG. 3 shows a portion of a two-dimensional array of analyte-assay regions constructed according to the method of the invention;

FIG. 4 is a planar view showing components of an automated apparatus for forming arrays [...];

FIG. 5 [...] a poly-l-lysine coated slide, where the total area covered by the 400 element array is 16 square millimeters;

FIG. 6 is a fluorescent image of a 1.8 cm×1.8 cm microarray containing lambda clones with yeast inserts, the fluorescent signal arising from the hybridization to the array with approximately half the yeast genome labeled with a green fluorophore and the other half with a red fluorophore;

FIG. 7 shows the translation of the hybridization image of FIG. 6 into a karyotype of the yeast genome, where the elements of the FIG. 6 microarray contain yeast DNA sequences that have been previously physically mapped in the yeast genome;

FIG. 8 shows a fluorescent image of a 0.5 cm×0.5 cm microarray of 24 cDNA clones, where the microarray was hybridized simultaneously with total cDNA from wild type Arabidopsis plant labeled with a green fluorophore and total cDNA from a transgenic Arabidopsis plant labeled with a red fluorophore, and the arrow points to the cDNA clone representing the gene introduced into the transgenic Arabidopsis plant;

FIG. 9 shows a plan view of substrate having an array of cells formed by barrier elements in the form of a grid;

FIG. 10 shows an enlarged plan view of one of the cells in the substrate in FIG. 9, showing an array of polynucleotide regions in the cell;

FIG. 11 is an enlarged sectional view of the substrate in FIG. 9, taken along a section line in that figure; and

FIG. 12 is a scanned image of a 3 cm×3 cm nitrocellulose solid support containing four identical arrays of M13 clones in each of four quadrants, where each quadrant was hybridized simultaneously to a different oligonucleotide using an open face hybridization method.

DETAILED DESCRIPTION OF THE INVENTION

I. Definitions

Unless indicated otherwise, the terms defined below have the following meanings:

"Ligand" refers to one member of a ligand/anti-ligand binding pair. The ligand may be, for example, one of the nucleic acid strands in a complementary, hybridized nucleic acid duplex binding pair, an effector molecule in an effector/receptor binding pair, or an antigen in an antigen/antibody or antigen/antibody fragment binding pair.

"Anti-ligand" refers to the opposite member of a ligand/anti-ligand binding pair. The anti-ligand may be the other of the nucleic acid strands in a complementary, hybridized nucleic acid duplex binding pair, the receptor molecule in an effector/receptor binding pair, or an antibody or antibody fragment molecule in antigen/antibody or antigen/antibody fragment binding pair, respectively.

"Analyte" or "analyte molecule" refers to a molecule, typically a macromolecule, such as a polynucleotide or polypeptide, whose presence, amount, and/or identity are to be determined. The analyte is one member of a ligand/anti-ligand pair.

"Analyte-specific assay reagent" refers to a molecule effective to bind specifically to an analyte molecule. The reagent is the opposite member of a ligand/anti-ligand binding pair.

A "microarray" is an array of regions having a density of discrete regions of at least about 100/cm², and preferably at least about 1000/cm². The regions in a microarray have typical dimensions, e.g., diameters, in the range of between about 10-250 µm, and are separated from other regions in the array by about the same distance.

A support surface is "hydrophobic" if an aqueous-medium droplet applied to the surface does not spread out substantially beyond the area size of the applied droplet. That is, the surface acts to prevent spreading of the droplet applied to the surface by hydrophobic interaction with the droplet.

A "meniscus" means a concave or convex surface that forms on the bottom of a liquid in a channel as a result of the surface tension of the liquid.

"Distinct biopolymers", as applied to the biopolymers forming a microarray, means an array member which is distinct from other array members on the basis of a different biopolymer sequence, and/or different concentrations of the same or distinct biopolymers, and/or different mixtures of distinct or different-concentration biopolymers. Thus an array of "distinct polynucleotides" means an array containing, as its members, (i) distinct polynucleotides, which may have a defined amount in each member, (ii) different, graded concentrations of given-sequence polynucleotides, and/or (iii) different-composition mixtures of two or more distinct polynucleotides.

"Cell type" means a cell from a given source, e.g., a tissue, or organ, or a cell in a given state of differentiation, or a cell associated with a given pathology or genetic makeup.

II. Method of Microarray Formation

This section describes a method of forming a microarray of analyte-assay regions on a solid support or substrate, where each region in the array has a known amount of a selected, analyte-specific reagent.
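The "known amount" of reagent in each region is governed by the deposited bead volume. Later in this section the bead is modeled as a hemisphere whose diameter is approximately equal to the dispenser tip width d, which gives V = (2/3)π(d/2)³. The following is a minimal sketch of that geometric relation only. The function name and the unit conversion are illustrative assumptions and do not appear in the patent, and actual deposited volumes also depend on surface hydrophobicity, contact time, withdrawal rate and viscosity, as the text notes.

```python
import math


def hemispherical_bead_volume_pl(tip_width_um: float) -> float:
    """Return the volume, in picoliters, of a hemispherical bead whose
    diameter equals the dispenser tip width (given in micrometers).

    Hypothetical helper for illustration; it models geometry only.
    """
    if tip_width_um <= 0:
        raise ValueError("tip width must be positive")
    radius_um = tip_width_um / 2.0
    # Hemisphere volume is half of a sphere: (2/3) * pi * r^3, in cubic micrometers.
    volume_um3 = (2.0 / 3.0) * math.pi * radius_um ** 3
    # 1 picoliter = 1000 cubic micrometers.
    return volume_um3 / 1000.0


if __name__ == "__main__":
    for width in (20, 200):
        print(f"d = {width} um -> {hemispherical_bead_volume_pl(width):.1f} pl")
```

For d = 20 µm this evaluates to roughly 2.1 pl, and for d = 200 µm to roughly 2.1 nl (about 2,094 pl), consistent with the 2 pl to 2 nl range quoted in the text for tip widths of about 20 to 200 µm.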
FIG. 1 illustrates, in a partially schematic view, a reagent-dispensing device 10 useful in practicing the method. The device generally includes a reagent dispenser 12 having an elongate open capillary channel 14 adapted to hold a quantity of the reagent solution, such as indicated at 16, as will be described below. The capillary channel is formed by a pair of spaced-apart, coextensive, elongate members 12a, 12b which are tapered toward one another and converge at a tip or tip region 18 at the lower end of the channel. More generally, the open channel is formed by at least two elongate, spaced-apart members adapted to hold a quantity of reagent solutions and having a tip region at which aqueous solution in the channel forms a meniscus, such as the concave meniscus illustrated at 20 in FIG. 2A. The advantages of the open channel construction of the dispenser are discussed below.

With continued reference to FIG. 1, the dispenser device also includes structure for moving the dispenser rapidly toward and away from a support surface, for effecting deposition of a known amount of solution in the dispenser on a support, as will be described below with reference to FIGS. 2A-2C. In the embodiment shown, the structure includes a solenoid 22 which is activatable to draw a solenoid piston 24 [...] referred to herein as dispensing means for moving the dispenser into engagement with a solid support, for dispensing a known volume of fluid on the support.

The dispensing device just described is carried on an arm [...] position the dispenser at a selected deposition position, as will be described.

FIGS. 2A-2C illustrate the method of depositing a known amount of reagent solution in the just-described dispenser on the surface of a solid support, such as the support indicated [...] support having a surface indicated at 31.

In one general embodiment, the surface is a relatively hydrophilic, wettable surface, such as a surface having native, bound or covalently attached charged groups. One such surface described below is a glass surface having an absorbed layer of a polycationic polymer, such as poly-l-lysine.

In another embodiment, the surface has or is formed to have a relatively hydrophobic character, i.e., one that causes aqueous medium deposited on the surface to bead. A variety of known hydrophobic polymers, such as polystyrene, polypropylene, or polyethylene have desired hydrophobic properties, as do glass and a variety of lubricant or other hydrophobic films that may be applied to the support surface.

Initially, the dispenser is loaded with a selected analyte-specific reagent solution, such as by dipping the dispenser tip, after washing, into a solution of the reagent, and allowing filling by capillary flow into the dispenser channel. The dispenser is now moved to a selected position with respect to a support surface, placing the dispenser tip directly above the support-surface position at which the reagent is to be deposited. This movement takes place with the dispenser tip in its raised position, as seen in FIG. 2A, where the tip is typically at least several millimeters (1-5 mm) above the surface of the substrate.

With the dispenser so positioned, solenoid 22 is now activated to cause the dispenser tip to move rapidly toward and away from the substrate surface, making momentary contact with the surface, in effect, tapping the tip of the dispenser against the support surface. The tapping movement of the tip against the surface acts to break the liquid meniscus in the tip channel, bringing the liquid in the tip into contact with the support surface. This, in turn, produces a flowing of the liquid into the capillary space between the tip and the surface, acting to draw liquid out of the dispenser channel, as seen in FIG. 2B.

FIG. 2C shows flow of [...] from the dispenser onto the [...]. [...] by the hydrophobic surface interaction of [...] surface, which acts to limit the total bead area on the surface, and by the surface tension of the droplet, which tends toward a given bead curvature. At this point, a given bead volume will have formed, and continued contact of the dispenser tip with the bead, as the dispenser tip is being withdrawn, will have little or no effect on bead volume.

[...]

The desired deposition volume, i.e., bead volume, formed [...] in the range 2 pl (picoliters) to 2 nl (nanoliters) [...] (i) the [...] of the dispenser tip, i.e., the size of the area spanned by the tip, (ii) the hydrophobicity of the support surface, and (iii) the time of contact with and rate of withdrawal of the tip from the support surface. In addition, bead size may be reduced by increasing the viscosity of the medium, effectively reducing [...] surface. The drop size may be further constrained by depositing the drop in a hydrophilic region surrounded by a hydrophobic grid pattern on the support surface.

In a typical embodiment, the dispenser tip is tapped rapidly against the support surface, with a total residence time in contact with the support of less than about 1 msec, and a rate of upward travel from the surface of about 10 cm/sec.

Assuming that the bead that forms on contact with the surface is a hemispherical bead, with a diameter approximately equal to the width of the dispenser tip, as shown in FIG. 2C, the volume of the bead formed in relation to dispenser tip width (d) is given in Table 1 below. As seen, the volume of the bead ranges between 2 pl to 2 nl as the width size is increased from about 20 to 200 µm.

TABLE 1

At a given tip size, bead volume can be reduced in a controlled fashion by increasing surface hydrophobicity, reducing time of contact of the tip with the surface, increasing rate of movement of the tip away from the surface,
and/or increasing the viscosity of the medium. Once these parameters are fixed, a selected deposition volume in the desired pl to nl range can be achieved in a repeatable fashion.

After depositing a bead at one selected location on a support, the tip is typically moved to a corresponding position on a second support, a droplet is deposited at that position, and this process is repeated until a liquid droplet of the reagent has been deposited at a selected position on each of a plurality of supports.

The tip is then washed to remove the reagent liquid, filled with another reagent liquid, and this reagent is now deposited at another array position on each of the supports. In one embodiment, the tip is washed and refilled by the steps of (i) dipping the capillary channel of the device in a wash solution, (ii) removing wash solution drawn into the capillary channel, and (iii) dipping the capillary channel into the new reagent solution.

From the foregoing, it will be appreciated that the tweezers-like, open-capillary dispenser tip provides the advantages that (i) the open channel of the tip facilitates rapid, efficient washing and drying before reloading the tip with a new reagent, (ii) passive capillary action can load the sample directly from a standard microwell plate while retaining sufficient sample in the open capillary reservoir for the printing of numerous arrays, (iii) open capillaries are less prone to clogging than closed capillaries, and (iv) open capillaries do not require a perfectly faced bottom surface for fluid delivery.

A portion of a microarray 36 formed on the surface 38 of a solid support 40 in accordance with the method just described is shown in FIG. 3. The array is formed of a plurality of analyte-specific reagent regions, such as regions 42, where each region may include a different analyte-specific reagent. As indicated above, the diameter of each region is preferably between about 20-200 µm. The spacing between each region and its closest (non-diagonal) neighbor, measured from center-to-center (indicated at 44), is preferably in the range of about 20-400 µm. Thus, for example, an array having a center-to-center spacing of about 250 µm contains about 40 regions/cm or 1,600 regions/cm². After formation of the array, the support is treated to evaporate the liquid of the droplet forming each region, to leave a desired array of dried, relatively flat regions. This drying may be done by heating or under vacuum.

In some cases, it is desired to first rehydrate the droplets containing the analyte reagents to allow for more time for adsorption to the solid support. It is also possible to spot out the analyte reagents in a humid environment so that droplets do not dry until the arraying operation is complete.

III. Automated Apparatus for Forming Arrays

In another aspect, the invention includes an automated apparatus for forming an array of analyte-assay regions on a solid support, where each region in the array has a known amount of a selected, analyte-specific reagent.

The apparatus is shown in planar, and partially schematic view in FIG. 4. A dispenser device 72 in the apparatus has the basic construction described above with respect to FIG. 1, and includes a dispenser 74 having an open-capillary channel terminating at a tip, substantially as shown in FIGS. 1 and 2A-2C.

The dispenser is mounted in the device for movement toward and away from a dispensing position at which the tip of the dispenser taps a support surface, to dispense a selected volume of reagent solution, as described above. This movement is effected by a solenoid 76 as described above.

Solenoid 76 is under the control of a control unit 77 whose operation will be described below. The solenoid is also referred to herein as dispensing means for moving the device into tapping engagement with a support, when the device is positioned at a defined array position with respect to that support.

The dispenser device is carried on an arm 74 which is threadedly mounted on a worm screw 80 driven (rotated) in a desired direction by a stepper motor 82 also under the control of unit 77. At its left end in the figure screw 80 is carried in a sleeve 84 for rotation about the screw axis. At its other end, the screw is mounted to the drive shaft of the stepper motor, which in turn is carried on a sleeve 86. The dispenser device, worm screw, the two sleeves mounting the worm screw, and the stepper motor used in moving the device in the "x" (horizontal) direction in the figure form what is referred to here collectively as a displacement assembly 86.

The displacement assembly is constructed to produce precise, micro-range movement in the direction of the screw, i.e., along an x axis in the figure. In one mode, the assembly functions to move the dispenser in x-axis increments having a selected distance in the range 5-25 µm. In another mode, the dispenser unit may be moved in precise x-axis increments of several microns or more, for positioning the dispenser at associated positions on adjacent supports, as will be described below.

The displacement assembly, in turn, is mounted for movement in the "y" (vertical) axis of the figure, for positioning the dispenser at a selected y axis position. The structure mounting the assembly includes a fixed rod 88 mounted rigidly between a pair of frame bars 90, 92, and a worm screw 94 mounted for rotation between a pair of frame bars 96, 98. The worm screw is driven (rotated) by a stepper motor 100 which operates under the control of unit 77. The motor is mounted on bar 96, as shown.

The structure just described, including worm screw 94 and motor 100, is constructed to produce precise, micro-range movement in the direction of the screw, i.e., along a y axis in the figure. As above, the structure functions in one mode to move the dispenser in y-axis increments having a selected distance in the range 5-250 µm, and in a second mode, to move the dispenser in precise y-axis increments of several microns (µm) or more, for positioning the dispenser at associated positions on adjacent supports.

The displacement assembly and structure for moving this assembly in the y axis are referred to herein collectively as positioning means for positioning the dispensing device at a selected array position with respect to a support.

A holder 102 in the apparatus functions to hold a plurality of supports, such as supports 104 on which the microarrays of reagent regions are to be formed by the apparatus. The holder provides a number of recessed slots, such as slot 106, which receive the supports, and position them at precise selected positions with respect to the frame bars on which the dispenser moving means is mounted.

As noted above, the control unit in the device functions to actuate the two stepper motors and dispenser solenoid in a sequence designed for automated operation of the apparatus in forming a selected microarray of reagent regions on each of a plurality of supports.

The control unit is constructed, according to conventional microprocessor control principles, to provide appropriate signals to each of the solenoid and each of the stepper motors, in a given timed sequence and for appropriate signalling time. The construction of the unit, and the settings
that are selected by the user to achieve a desired array pattern, will be understood from the following description of a typical apparatus operation.

Initially, one or more supports are placed in one or more slots in the holder. The dispenser is moved to a position directly above a well (not shown) containing a solution of the first reagent to be dispensed on the support(s). The dispenser solenoid is actuated now to lower the dispenser tip into this well, causing the capillary channel in the dispenser to fill. Motors 82, 100 are now actuated to position the dispenser at a selected array position at the first of the supports. Solenoid actuation of the dispenser is then effective to dispense a selected-volume droplet of that reagent at this location. As noted above, this operation is effective to dispense a selected volume preferably between 2 pl and 2 nl of the reagent solution.

The dispenser is now moved to the corresponding position at an adjacent support and a similar volume of the solution is dispensed at this position. The process is repeated until the reagent has been dispensed at this preselected corresponding position on each of the supports.

Where it is desired to dispense a single reagent at more than two array positions on a support, the dispenser may be moved to different array positions at each support, before moving the dispenser to a new support, or solution can be dispensed at individual positions on each support, at one selected position, then the cycle repeated for each new array position.

To dispense the next reagent, the dispenser is positioned over a wash solution (not shown), and the dispenser tip is dipped in and out of this solution until the reagent solution has been substantially washed from the tip. Solution can be removed from the tip, after each dipping, by vacuum, compressed air spray, sponge, or the like.

The dispenser tip is now dipped in a second reagent well, and the filled tip is moved to a second selected array position in the first support. The process of dispensing reagent at each of the corresponding second-array positions is then carried out as above. This process is repeated until an entire microarray of reagent solutions on each of the supports has been formed.

IV. Microarray Substrate

This section describes embodiments of a substrate having a microarray of biological polymers carried on the substrate surface. Subsection A describes a multi-cell substrate, each cell of which contains a microarray, and preferably an identical microarray, of distinct biopolymers, such as distinct polynucleotides, formed on a porous surface. Subsection B describes a microarray of distinct polynucleotides bound on a glass slide coated with a polycationic polymer.

A. Multi-Cell Substrate

FIG. 9 illustrates, in plan view, a substrate 110 constructed according to the invention. The substrate has an 8×12 rectangular array 112 of cells, such as cells 114, 116, formed on the substrate surface. With reference to FIG. 10, each cell, such as cell 114, in turn supports a microarray 118 of distinct biopolymers, such as polypeptides or polynucleotides at known, addressable regions of the microarray. Two such regions forming the microarray are indicated at 120, and correspond to regions, such as regions 42, forming the microarray of distinct biopolymers shown in FIG. 3.

The 96-cell array shown in FIG. 9 typically has array dimensions between about 12 and 244 mm in width and 8 and 400 mm in length, with the cells in the array having width and length dimensions of 1/12 and 1/8 the array width and length dimensions, respectively, i.e., between about 1 and 20 mm in width and 1 and 50 mm in length.

The construction of substrate is shown cross-sectionally in FIG. 11, which is an enlarged sectional view taken along view line 124 in FIG. 9. The substrate includes a water-impermeable backing 126, such as a glass slide or rigid polymer sheet. Formed on the surface of the backing is a water-permeable film 128. The film is formed of a porous membrane material, such as nitrocellulose membrane, or a porous web material, such as a nylon, polypropylene, or PVDF porous polymer material. The thickness of the film is preferably between about 10 and 1000 µm. The film may be applied to the backing by spraying or coating uncured material on the backing, or by applying a preformed membrane to the backing. The backing and film may be obtained as a preformed unit from commercial source, e.g., a plastic-backed nitrocellulose film available from Schleicher and Schuell Corporation.

With continued reference to FIG. 11, the film-covered surface in the substrate is partitioned into a desired array of cells by water-impermeable grid lines, such as lines 130, 132, which have infiltrated the film down to the level of the backing, and extend above the surface of the film as shown, typically a distance of 100 to 2000 µm above the film surface.

The grid lines are formed on the substrate by laying down an uncured or otherwise flowable resin or elastomer solution in an array grid, allowing the material to infiltrate the porous film down to the backing, then curing or otherwise hardening the grid lines to form the cell-array substrate.

One material for the grid is a flowable silicone available from Loctite Corporation. The barrier material can be extruded through a narrow syringe (e.g., 22 gauge) using air pressure or mechanical pressure. The syringe is moved relative to the solid support to print the barrier elements as a grid pattern. The extruded bead of silicone wicks into the pores of the solid support and cures to form a shallow waterproof barrier separating the regions of the solid support.

In alternative embodiments, the barrier element can be a [...] or a thermoset material such as epoxy. The barrier material can also be a UV-curing polymer which is exposed to UV light after being printed onto the solid support. The barrier material may also be applied to the solid support using printing techniques such as silk-screen printing. The barrier material may also be a heat-seal stamping of the porous solid support which seals its pores and forms a water-impervious barrier element. The barrier material may also be a shallow grid which is laminated or otherwise adhered to the solid support.

In addition to plastic-backed nitrocellulose, the solid support can be virtually any porous membrane with or without a non-porous backing. Such membranes are available from numerous vendors and are made from nylon, PVDF, polysulfone and the like. In an alternative embodiment, the barrier element may also be used to adhere the porous membrane to a non-porous backing in addition to functioning as a barrier to prevent cross contamination of the assay reagents.

In an alternative embodiment, the solid support can be of a non-porous material. The barrier can be printed either before or after the microarray of biomolecules is printed on the solid support.

As can be appreciated, the cells formed by the grid lines and the underlying backing are water-impermeable, having side barriers projecting above the porous film in the cells. Thus, defined-volume samples can be placed in each well without risk of cross-contamination with sample material in adjacent cells. In FIG. 11, defined-volume samples, such as sample 134, are shown in the cells.
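The dispensing cycle described in Section III (load one reagent, deposit it at the same array position on every support, wash the tip, then repeat with the next reagent at the next array position) can be sketched as a simple ordering of operations. This is only an illustrative sketch under stated assumptions: the function name, the tuple layout and the reagent labels are hypothetical, and no stepper-motor or solenoid control is modeled. The same ordering would apply if the cells of a multi-cell substrate were substituted for the separate supports.

```python
from typing import Iterator, List, Optional, Tuple

# (action, reagent, support index or None, array position)
Step = Tuple[str, str, Optional[int], int]


def dispensing_schedule(reagents: List[str], n_supports: int) -> Iterator[Step]:
    """Yield steps in reagent-major order.

    Each reagent is assigned one array position (its index in `reagents`)
    and is dispensed at that position on every support before the tip is
    washed and the next reagent is loaded.
    """
    if n_supports < 1:
        raise ValueError("need at least one support")
    for position, reagent in enumerate(reagents):
        yield ("load", reagent, None, position)
        for support in range(n_supports):
            yield ("dispense", reagent, support, position)
        yield ("wash", reagent, None, position)


if __name__ == "__main__":
    for step in dispensing_schedule(["reagent_A", "reagent_B"], n_supports=3):
        print(step)
```

Ordering the work by reagent rather than by support means the tip is washed and reloaded once per reagent, however many supports are being printed.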
As noted above, each well contains a microarray of distinct biopolymers. In one general embodiment, the microarrays in the well are identical arrays of distinct biopolymers, e.g., different sequence polynucleotides. Such arrays can be formed in accordance with the methods described in Section II, by depositing a first selected polynucleotide at the same selected microarray position in each of the cells, then depositing a second polynucleotide at a different microarray position in each well, and so on until a complete, identical microarray is formed in each cell.

In a preferred embodiment, each microarray contains about 10³ distinct polynucleotide or polypeptide biopolymers per surface area of less than about 1 cm². Also in a preferred embodiment, the biopolymers in each microarray region are present in a defined amount between about 0.1 femtomoles and 100 nanomoles. The ability to form high-density arrays of biopolymers, where each region is formed of a well-defined amount of deposited material, can be achieved in accordance with the microarray-forming method described in Section II.

Also in a preferred embodiment, the biopolymers are polynucleotides having lengths of at least about 50 bp, i.e., substantially longer than oligonucleotides which can be formed in high-density arrays by schemes involving parallel, step-wise polymer synthesis on the array surface.

In the case of a polynucleotide array, in an assay procedure, a small volume of the labeled DNA probe mixture in a standard hybridization solution is loaded onto each cell. The solution will spread to cover the entire microarray and stop at the barrier elements. The solid support is then incubated in a humid chamber at the appropriate temperature as required by the assay.

Each assay may be conducted in an "open-face" format where no further sealing step is required, since the hybridization solution will be kept properly hydrated by the water vapor in the humid chamber. At the conclusion of the incubation step, the entire solid support containing the numerous microarrays is rinsed quickly enough to dilute the assay reagents so that no significant cross contamination occurs. The entire solid support is then reacted with detection reagents if needed and analyzed using standard colorimetric, radioactive or fluorescent detection means. All processing and detection steps are performed simultaneously to all of the microarrays on the solid support, ensuring uniform assay conditions for all of the microarrays on the solid support.

B. Glass-Slide Polynucleotide Array

FIG. 5 shows a substrate 136 formed according to another aspect of the invention, and intended for use in detecting binding of labeled polynucleotides to one or more of a plurality of distinct polynucleotides. The substrate includes a glass substrate 138 having formed on its surface, a coating of a polycationic polymer, preferably a cationic polypeptide, such as polylysine or polyarginine. Formed on the polycationic coating is a microarray 140 of distinct polynucleotides, each localized at known selected array regions, such as regions 142.

The slide is coated by placing a uniform-thickness film of a polycationic polymer, e.g., poly-l-lysine, on the surface of a slide and drying the film to form a dried coating. The amount of polycationic polymer added is sufficient to form at least a monolayer of polymers on the glass surface. The polymer film is bound to surface via electrostatic binding between negative silyl-OH groups on the surface and charged amine groups in the polymers. Poly-l-lysine coated glass slides may be obtained commercially, e.g., from Sigma Chemical Co. (St. Louis, Mo.).

To form the microarray, defined volumes of distinct polynucleotides are deposited on the polymer-coated slide, as described in Section II. According to an important feature of the substrate, the deposited polynucleotides remain bound to the coated slide surface non-covalently when an aqueous DNA sample is applied to the substrate under conditions which allow hybridization of reporter-labeled polynucleotides in the sample to complementary-sequence (single-stranded) polynucleotides in the substrate array. The method is illustrated in Examples 1 and 2.

To illustrate this feature, a substrate of the type just described, but having an array of same-sequence polynucleotides, was mixed with fluorescent-labeled complementary DNA under hybridization conditions. After washing to remove non-hybridized material, the substrate was examined by low-power fluorescence microscopy. The array can be visualized by the relatively uniform labeling pattern of the array regions.

In a preferred embodiment, each microarray contains at least 10³ distinct polynucleotide or polypeptide biopolymers per surface area of less than about 1 cm². In the embodiment shown in FIG. 5, the microarray contains 400 regions in an area of about 16 mm², or 2.5×10³ regions/cm². Also in a preferred embodiment, the polynucleotides in each microarray region are present in a defined amount between about 0.1 femtomoles and 100 nanomoles in the case of polynucleotides. As above, the ability to form high-density arrays of this type, where each region is formed of a well-defined amount of deposited material, can be achieved in accordance with the microarray-forming method described in Section II.

Also in a preferred embodiment, the polynucleotides have lengths of at least about 50 bp, i.e., substantially longer than oligonucleotides which can be formed in high-density arrays by various in situ synthesis schemes.

V. Utility

Microarrays of immobilized nucleic acid sequences prepared in accordance with the invention can be used for large scale hybridization assays in numerous genetic applications, including genetic and physical mapping of genomes, monitoring of gene expression, DNA sequencing, genetic diagnosis, genotyping of organisms, and distribution of DNA reagents to researchers.

For gene mapping, a gene or a cloned DNA fragment is hybridized to an ordered array of DNA fragments, and the identity of the DNA elements applied to the array is unambiguously established by the pixel or pattern of pixels of the array that are detected. One application of such arrays for creating a genetic map is described by Nelson, et al. (1993). In constructing physical maps of the genome, arrays of immobilized cloned DNA fragments are hybridized with other cloned DNA fragments to establish whether the cloned fragments in the probe mixture overlap and are therefore contiguous to the immobilized clones on the array. For example, Lehrach, et al., describe such a process.

The arrays of immobilized DNA fragments may also be used for genetic diagnostics. To illustrate, an array containing multiple forms of a mutated gene or genes can be probed with a labeled mixture of a patient's DNA which will preferentially interact with only one of the immobilized versions of the gene.

The detection of this interaction can lead to a medical diagnosis. Arrays of immobilized DNA fragments can also be used in DNA probe diagnostics. For example, the identity of a pathogenic microorganism can be established unambiguously by hybridizing a sample of the unknown pathogen's DNA to an array containing many types of known pathogenic DNA. A similar technique can also be used for
unambiguous genotyping of any organism. Other molecules of genetic interest, such as cDNAs and
RNAs, can be immobilized on the array or alternately used as the labeled probe mixture that is
applied to the array.

In one application, an array of cDNA clones representing genes is hybridized with total cDNA from
an organism to monitor gene expression for research or diagnostic purposes. Labeling total cDNA
from a normal cell with one color fluorophore and total cDNA from a diseased cell with another
color fluorophore and simultaneously hybridizing the two cDNA samples to the same array of cDNA
clones allows for differential gene expression to be measured as the ratio of the two fluorophore
intensities. This two-color experiment can be used to monitor gene expression in different tissue
types, disease states, response to drugs, or response to environmental factors. An example of this
approach is illustrated in Example 2, described with respect to FIG. 8.

By way of example and without implying a limitation of scope, such a procedure could be used to
simultaneously screen many patients against all known mutations in a disease gene. This invention
could be used in the form of, for example, 96 identical 0.9 cm×2.2 cm microarrays fabricated on a
single 12 cm×18 cm sheet of plastic-backed nitrocellulose where each microarray could contain, for
example, 100 DNA fragments representing all known mutations of a given gene. The region of
interest from each of the DNA samples from 96 patients could be amplified, labeled, and hybridized
to the 96 individual arrays with each assay performed in 100 microliters of hybridization
solution. The approximately 1 mm thick silicone rubber barrier elements between individual arrays
prevent cross-contamination of the patient samples by sealing the pores of the nitrocellulose and
by acting as a physical barrier between each microarray. The solid support containing all 96
microarrays assayed with the 96 patient samples is incubated, rinsed, detected and analyzed as a
single sheet of material using standard radioactive, fluorescent, or colorimetric detection means
(Maniatis, et al., 1989). Previously, such a procedure would involve the handling, processing and
tracking of 96 separate membranes in 96 separate sealed chambers. By processing all 96 arrays as a
single sheet of material, significant time and cost savings are possible.

The assay format can be reversed where the patient or organism's DNA is immobilized as the array
elements and each array is hybridized with a different mutated allele or genetic marker. The
gridded solid support can also be used for parallel non-DNA ELISA assays. Furthermore, the
invention allows for the use of all standard detection methods without the need to remove the
shallow barrier elements to carry out the detection step.

In addition to the genetic applications listed above, arrays of whole cells, peptides, enzymes,
antibodies, antigens, receptors, ligands, phospholipids, polymers, drug cogener preparations or
chemical substances can be fabricated by the means described in this invention for large scale
screening assays in medical diagnostics, drug discovery, molecular biology, immunology and
toxicology.

The multi-cell substrate aspect of the invention allows for the rapid and convenient screening of
many DNA probes against many ordered arrays of DNA fragments. This eliminates the need to handle
and detect many individual arrays for performing mass screenings for genetic research and
diagnostic applications. Numerous microarrays can be fabricated on the same solid support and each
microarray reacted with a different DNA probe while the solid support is processed as a single
sheet of material.

The following examples illustrate, but in no way are intended to limit, the present invention.

EXAMPLE 1

Genomic-Complexity Hybridization to DNA Microarrays Representing the Yeast Saccharomyces
cerevisiae Genome with Two-Color Fluorescent Detection

The array elements were randomly amplified PCR (Bohlander, et al., 1992) products using physically
mapped lambda clones of S. cerevisiae genomic DNA as templates (Riles, et al., 1993). The PCR was
performed directly on the lambda phage lysates, resulting in an amplification of both the 35 kb
lambda vector and the 5-15 kb yeast insert sequences in the form of a uniform distribution of PCR
product between 250-1500 base pairs in length. The PCR product was purified using Sephadex G50 gel
filtration (Pharmacia, Piscataway, N.J.) and concentrated by evaporation to dryness at room
temperature overnight. Each of the 864 amplified lambda clones was rehydrated in 15 µl of 3×SSC in
preparation for spotting onto the glass.

The microarrays were fabricated on microscope slides which were coated with a layer of
poly-l-lysine (Sigma). The automated apparatus described in Section III loaded 1 µl of the
concentrated lambda clone PCR product in 3×SSC directly from 96 well storage plates into the open
capillary printing element and deposited ~5 nl of sample per slide at 380 micron spacing between
spots, on each of 40 slides. The process was repeated for all 864 samples and 8 control spots.
After the spotting operation was complete, the slides were rehydrated in a humid chamber for 2
hours, baked in a dry 80° vacuum oven for 2 hours, rinsed to remove unabsorbed DNA and then
treated with succinic anhydride to reduce non-specific adsorption of the labeled hybridization
probe to the poly-l-lysine coated glass surface. Immediately prior to use, the immobilized DNA on
the array was denatured in distilled water at 90° for 2 minutes.

For the pooled chromosome experiment, the 16 chromosomes of Saccharomyces cerevisiae were separated
in a CHEF agarose gel apparatus (Biorad, Richmond, Calif.). The six largest chromosomes were
isolated in one gel slice and the ten smallest chromosomes in a second gel slice. The DNA was
recovered using a gel extraction kit (Qiagen, Chatsworth, Calif.). The two chromosome pools were
randomly amplified in a manner similar to that used for the target lambda clones. Following
amplification, 5 micrograms of each of the amplified chromosome pools were separately
random-primer labeled using Klenow polymerase (Amersham, Arlington Heights, Ill.) with a lissamine
conjugated nucleotide analog (Dupont NEN, Boston, Mass.) for the pool containing the six largest
chromosomes, and with a fluorescein conjugated nucleotide analog (BMB) for the pool containing the
ten smallest chromosomes. The two pools were mixed and concentrated using an ultrafiltration
device (Amicon, Danvers, Mass.).

Five micrograms of the hybridization probe consisting of both chromosome pools in 7.5 µl of TE was
denatured in a boiling water bath and then snap cooled on ice. 2.5 µl of concentrated
hybridization solution (5×SSC and 0.1% SDS) was added and all 10 µl transferred to the array
surface, covered with a cover slip, placed in a custom-built single-slide humidity chamber and
incubated at 60° for 12 hours. The slides were then rinsed at room temperature in 0.1×SSC and 0.1%
SDS for 5 minutes, cover slipped and scanned.

A custom built laser fluorescent scanner was used to detect the two-color hybridization signals
from the 1.8×1.8
cm array at 20 micron resolution. The scanned image was gridded and analyzed using custom image
analysis software. After correcting for optical crosstalk between the fluorophores due to their
overlapping emission spectra, the red and green hybridization values for each clone on the array
were correlated to the known physical map position of the clone, resulting in a computer-generated
color karyotype of the yeast genome.

FIG. 6 shows the hybridization pattern of the two chromosome pools. A red signal indicates that
the lambda clone on the array surface contains a cloned genomic DNA segment from one of the six
largest yeast chromosomes. A green signal indicates that the lambda clone insert comes from one of
the ten smallest yeast chromosomes. Orange signals indicate repetitive sequences which cross
hybridized to both chromosome pools. Control spots on the array confirm that the hybridization is
specific and reproducible.

The physical map locations of the genomic DNA fragments contained in each of the clones used as
array elements have been previously determined by Olson and coworkers (Riles, et al.), allowing for
the automatic generation of the color karyotype shown in FIG. 7. The color of a chromosomal section
on the karyotype corresponds to the color of the array element containing the clone from that
section. The black regions of the karyotype represent false negative dark spots on the array (10%)
or regions of the genome not covered by the Olson clone library (90%). Note that the six largest
chromosomes are mainly red while the ten smallest chromosomes are mainly green, thus matching the
original CHEF gel isolation of the hybridization probe. Areas of the red chromosomes containing
green spots and vice-versa are probably due to spurious sample tracking errors in the formation of
the original library and in the amplification and spotting procedures.

The yeast genome arrays have also been probed with individual clones or pools of clones that are
fluorescently labeled for physical mapping purposes. The hybridization signals of these clones to
the array were translated into positions on the physical map of the yeast genome.

EXAMPLE 2

Total cDNA Hybridized to Micro Arrays of cDNA Clones with Two-Color Fluorescent Detection

Twenty-four clones containing cDNA inserts from the plant Arabidopsis were amplified using PCR.
Salt was added to the purified PCR products to a final concentration of 3×SSC. The cDNA clones
were spotted on poly-l-lysine coated microscope slides in a manner similar to Example 1. Among the
cDNA clones was a clone representing a transcription factor HAT4, which had previously been used
to create a transgenic line of the plant Arabidopsis, in which this gene is present at ten times
the level found in wild-type Arabidopsis (Schena, et al., 1992).

Total poly-A mRNA from wild type Arabidopsis was isolated using standard methods and reverse
transcribed into total cDNA, using a fluorescein nucleotide analog to label the cDNA product
(green fluorescence). A similar procedure was performed with the transgenic line of Arabidopsis
where the transcription factor HAT4 was inserted into the genome using standard gene transfer
protocols. cDNA copies of mRNA from the transgenic plant are labeled with a lissamine nucleotide
analog (red fluorescence). Two micrograms of the cDNA products from each type of plant were pooled
together and hybridized to the cDNA clone array in a 10 microliter hybridization reaction in a
manner similar to Example 1. Rinsing and detection of hybridization was also performed in a manner
similar to Example 1. FIG. 8 shows the resultant hybridization pattern of the array.

Genes equally expressed in wild type and the transgenic Arabidopsis appeared yellow due to equal
contributions of the green and red fluorescence to the final signal. The dots are different
intensities of yellow, indicating various levels of gene expression. The cDNA clone representing
the transcription factor HAT4, expressed in the transgenic line of Arabidopsis but not detectably
expressed in wild type Arabidopsis, appears as a red dot (with the arrow pointing to it),
indicating the preferential expression of the transcription factor in the red-labeled transgenic
Arabidopsis and the relative lack of expression of the transcription factor in the green-labeled
wild type Arabidopsis.

An advantage of the microarray hybridization format for gene expression studies is the high
partial concentration of each cDNA species achievable in the 10 microliter hybridization reaction.
This high partial concentration allows for detection of rare transcripts without the need for PCR
amplification of the hybridization probe, which may bias the true genetic representation of each
discrete cDNA species.

Gene expression studies such as these can be used for genomics research to discover which genes
are expressed in which cell types, disease states, development states or environmental conditions.
Gene expression studies can also be used for diagnosis of disease by empirically correlating gene
expression patterns to disease states.

EXAMPLE 3

Multiplexed Colorimetric Hybridization on a Gridded Solid Support

A sheet of plastic-backed nitrocellulose was gridded with barrier elements made from silicone
rubber according to the description in Section IV-A. The sheet was soaked in 10×SSC and allowed to
dry. As shown in FIG. 12, 192 M13 clones, each with a different yeast insert, were arrayed 400
microns apart in four quadrants of the solid support using the automated device described in
Section III. The bottom left quadrant served as a negative control for hybridization, while each
of the other three quadrants was hybridized simultaneously with a different oligonucleotide using
the open-face hybridization technology described in Section IV-A. The first two and last four
elements of each array are positive controls for the colorimetric detection step.

The oligonucleotides were labeled with fluorescein, which was detected using an anti-fluorescein
antibody conjugated to alkaline phosphatase that precipitated an NBT/BCIP dye on the solid support
(Amersham). Perfect matches between the labeled oligos and the M13 clones resulted in dark spots
visible to the naked eye and detected using an optical scanner (HP ScanJet II) attached to a
personal computer. The hybridization patterns are different in every quadrant, indicating that
each oligo found several unique M13 clones from among the 192 with a perfect sequence match. Note
that the open capillary printing tip leaves detectable dimples on the nitrocellulose which can be
used to automatically align and analyze the images.

Although the invention has been described with respect to specific embodiments and methods, it
will be clear that various changes and modifications may be made without departing from the
invention.
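
Examples 1 and 2 rely on two numerical steps that the text names but does not spell out:
correcting the red and green readings for optical crosstalk, and expressing differential signal
as a ratio of the two fluorophore intensities. The following is only a minimal sketch of how
such steps could be carried out, assuming a simple linear two-channel mixing model. The
function names, the crosstalk coefficients and the intensity floor are hypothetical and are not
taken from the patent.

```python
import math


def unmix(red_raw, green_raw, green_into_red=0.05, red_into_green=0.02):
    """Undo linear crosstalk between two fluorescence channels.

    Assumed (hypothetical) mixing model:
        red_raw   = red_true   + green_into_red * green_true
        green_raw = green_true + red_into_green * red_true
    The coefficients would have to be measured on single-dye control spots.
    """
    det = 1.0 - green_into_red * red_into_green
    red_true = (red_raw - green_into_red * green_raw) / det
    green_true = (green_raw - red_into_green * red_raw) / det
    return red_true, green_true


def log2_ratio(red, green, floor=1.0):
    """Log2 of the red/green ratio; values below `floor` are clamped to it."""
    return math.log2(max(red, floor) / max(green, floor))


if __name__ == "__main__":
    # Raw readings for one array element (arbitrary units, made up for illustration).
    red, green = unmix(1010.0, 220.0)
    print(round(red, 1), round(green, 1), round(log2_ratio(red, green), 2))
```

A positive log ratio would correspond to an element that is predominantly red (lissamine
labeled) and a negative one to an element that is predominantly green (fluorescein labeled). A
value near zero corresponds to the yellow or orange elements described in the examples.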
We claim:

1. A method of forming a microarray of discrete analyte-assay regions on a solid support, where
each discrete region in the microarray has a selected, analyte-specific reagent, said method
comprising,

(a) loading an aqueous solution of a selected analyte-specific reagent in a reagent-dispensing
device having an elongate capillary channel adapted to hold a quantity of the reagent solution
and having a tip region at which the solution in the channel forms a meniscus,

(b) tapping the tip of the dispensing device against a solid support at a defined position on
the surface, with an impulse effective to break the meniscus in the capillary channel and
deposit a selected volume between 0.002 and 2 nl of solution on the surface, and

(c) repeating steps (a) and (b) until said microarray is formed.

2. The method of claim 1, wherein the reagents used to form the discrete regions in the
microarray are distinct nucleic acid strands and wherein steps (a) and (b) are repeated until the
microarray has about 100 or more discrete regions of distinct nucleic acid strands per cm² of
solid support.

3. The method of claim 1, wherein the reagents used to form the discrete regions in the
microarray are distinct nucleic acid strands and wherein steps (a) and (b) are repeated until the
microarray has about 2000 or more discrete regions of distinct nucleic acid strands per cm² of
solid support.

4. The method of claim 2, wherein the channel is open-sided.

5. The method of claim 3, wherein the channel is open-sided.

6. The method of claim 4, wherein the volume is between 0.002 and 0.25 nl.

7. The method of claim 5, wherein the volume is between 0.002 and 0.25 nl.
Methods

Generation of microarrays, hybridization, scanning. The preparation of coated microscope slides
and [illegible] for printing was carried out in a manner similar to that described [reference
illegible]. Briefly, pre-cleaned slides were treated with poly-L-lysine solution (Sigma) to form
an adhesive surface for printing. PCR products, purified by ethanol purification, were
resuspended in 3× SSC. A custom built arraying robot was used to print a small amount
([illegible] nanolitres) of DNA onto the slides. After printing, the slides were washed in an
SDS solution [concentration illegible]. The remaining DNA was denatured by submerging the slides
in heated distilled water [temperature illegible], followed by a wash with ethanol. DNA was UV
crosslinked to the slides (Stratagene Stratalinker, [dose illegible]). To prevent non-specific
probe binding, the slides were blocked by rinsing in a solution of succinic anhydride dissolved
in [illegible] containing 1-methyl-2-pyrrolidinone (Aldrich) [concentrations illegible].
Additional protocols and parts list pertaining to microarray fabrication can be obtained from
http://cmgm.stanford.edu/pbrown.

Purified, labelled cDNA was resuspended in 11 µl of SSC [concentration illegible] containing
[illegible]. [The hybridization and wash conditions are illegible in this copy.] Each slide was
scanned using the appropriate excitation line for each fluorophore. Data was collected at
[illegible] per pixel [illegible].

Probe preparation and labelling. RNA was extracted from [illegible] according to the
manufacturer's directions. mRNA was [illegible] (Pharmacia). Fluorescently labelled cDNA was
prepared from mRNA by oligo dT-primed polymerization using SuperScript II reverse transcriptase
(LTI Inc.). The pool of nucleotides in the labelling reaction was [illegible]. Fluorescent
nucleotides, Rhodamine 110 dUTP (Perkin Elmer Cetus) or Cy3 dUTP (Amersham), were present at
[illegible]. Probes were purified by gel chromatography (Bio-Spin, [illegible]) and [illegible].

Selection of cDNA elements and generation of control templates. Synthetic cDNAs were prepared by
cloning random BamHI and HindIII ended fragments of E. coli DNA in the vector [illegible]
poly(A) (Promega), linearizing isolated plasmid [illegible], and producing poly(A)+ tailed RNA
complementary to the insert from the resident SP6 promoter (Promega). The synthesized RNAs were
selected on oligo-dT cellulose. The largest group of cDNAs consisted of 674 cDNA clones from the
[illegible] human infant brain library [illegible], each designated a named gene according
Fig. 3 Northern hybridization substantiating the consistency of the cDNA microarray results.
Corresponding locations within the cDNA microarray illustrated in Fig. 2a are provided for
1) Waf1/p21; 2) MARCKS; 3) collagenase; 4) MCAF/MCP-1; 5) α-1-antichymotrypsin; and 6) β-actin.
The signal detected by a radio-labelled β-actin probe represents a control for loading variance,
with a red/green ratio observed on the cDNA microarray (Fig. 2a,c) for β-actin of 1.04.
in the UniGene EST clustering system [references illegible]. The second largest group of clones
consisted of [number illegible] sequenced cDNA clones generated by subtraction of cDNA from the
chromosome 6 suppressed non-tumorigenic UACC-903(+6) cell line with cDNA from its parental
tumorigenic cell line UACC-903 (ref. 9). Approximately 100 additional genes (total 870 genes
arrayed) were obtained from EST libraries on the basis of their expression pattern (tissue
specific, and so on). Each array included the following hybridization controls: plasmid vector,
lambda, φX174 phage, total human DNA, human Cot1 DNA, and poly(A)+. The synthetic standards used
for normalization of signals in each wavelength were also arrayed. Controls were included in
each quadrant of the array to assess the reproducibility of the hybridization signal. Two plates
of cDNA clones (derived from the UACC-903(+6) subtracted library) were also arrayed in duplicate.
Fidelity of the UniGene array relative to dbEST was tested by sequencing of a random sample of 11
clones used for microarray construction. All sequences were identical with the corresponding
dbEST entries. Additionally, [illegible] of cDNA from the UACC-903(+6) subtracted library was
sequenced. A listing of cDNAs comprising this microarray which were derived from the UniGene and
'housekeeping' panel can be obtained from http://www.nchgr.nih.gov/DIR/LCG/ARRAY/expn.html.

Northern blot analysis. Total RNA, 10 µg per lane, was electrophoresed in 1.2%
agarose-formaldehyde gels and transferred onto nylon membrane (Hybond-N+, Amersham) by capillary
blotting overnight. For DNA probes, insert fragments from the Soares 1NIB cDNA library (ref. 10)
were obtained by vector PCR for p21, MARCKS, α-1-antichymotrypsin and β-actin. Probes for
fibroblast collagenase and MCAF/MCP-1 were isolated from a UACC-903(+6) enriched cDNA library
(ref. 9), with all probes labelled by random priming. Filters were washed to a stringency of
0.1× SSC at 42 °C for 20 min.

Web sites, http://cmgm.stanford.edu/pbiown for protocols and 
parts list pertaining to microarray fabrication. 
http://www.nchgr.nih.go\7l )l K/|.( ".(*,/ A UK AY /ex pn. html for a 
listing til cDNAs comprising this microarray which- weie 
derived from the Unigene and 'housi'keepmg' panel. 

Acknowledgements 

Work in P.O.B.'s laboratory is supported in part by the Howard Hughes Medical Institute and
National Center for Human Genome Research [grant number illegible]. We would like to acknowledge
the excellent technical and graphics assistance of [names illegible]. J.D. was supported by NIH
grant [number illegible]. P.O.B. is an assistant investigator of the Howard Hughes Medical
Institute.



Received 15 October accepted 8 November, 1995. 



1. Vogelstein, B. & Kinzler, K.W. The multistep nature of cancer. Trends Genet. 9, 138-141
(1993).

2. Weinberg, R.A. The molecular basis of oncogenes and tumor suppressor genes. Ann. NY Acad. Sci.
758, 331-338 (1995).

3. Levine, A.J. The tumor suppressor genes. Annu. Rev. Biochem. 62, 623-651 (1993).

4. Trent, J.M. et al. Tumorigenicity in human melanoma cell lines controlled by introduction of
human chromosome 6. Science 247, 568-571 (1990).

5. Su, Y. et al. Reversion of monochromosome-mediated suppression of tumorigenicity in malignant
melanoma by retroviral transduction. Cancer Res. 56, 3186-3191 (1996).

6. Schena, M., Shalon, D., Davis, R.W. & Brown, P.O. Quantitative monitoring of gene expression
patterns with a complementary DNA microarray. Science 270, 467-470 (1995).

7. Shalon, D., Smith, S.J. & Brown, P.O. A DNA microarray system for analyzing complex DNA
samples using two-color fluorescent probe hybridization. Genome Res. 6, 639-645 (1996).

8. Schena, M. et al. Parallel human genome analysis: microarray based expression of 1000 genes.
Proc. Natl. Acad. Sci. USA 93, 10539-11286 (1996).

9. Ray, M.E., Su, Y.A., Meltzer, P.S. & Trent, J.M. Isolation and characterization of genes
associated with chromosome 6 mediated tumor suppression in human malignant melanoma. Oncogene 12,
2527-2533 (1996).

10. Soares, M.B. et al. Construction and characterization of a normalized cDNA library. Proc.
Natl. Acad. Sci. USA 91, 9228-9232 (1994).

11. Boguski, M.S. & Schuler, G.D. ESTablishing a human transcript map. Nature Genet. 10, 369-371
(1995).

12. Vijayasaradhi, S., Doskoch, P.M., Wolchok, J. & Houghton, A.N. Melanocyte differentiation
marker gp75, the brown locus protein, can be regulated independently of tyrosinase and
pigmentation. J. Invest. Dermatol. 105, 113-119 (1995).

13. Vijayasaradhi, S., Xu, Y., Bouchard, B. & Houghton, A.N. Intracellular sorting and targeting
of melanosomal membrane proteins: identification of signals for sorting of the human brown locus
protein, gp75. J. Invest. Dermatol. 130, 807-820 (1995).

14. Nakao, J. et al. Expression of proteolipid protein gene is directly associated with secretion
of a factor influencing oligodendrocyte development. J. Neurochem. 64, 2396-2403 (1995).

15. Graves, D.T., Barnhill, R., Galanopoulos, T. & Antoniades, H.N. Expression of monocyte
chemotactic protein-1 in human melanoma in vivo. Am. J. Pathol. 140, 9-14 (1992).

16. Kristensen, M.S., Deleuran, B.W., Larsen, C.G., Thestrup-Pedersen, K. & Paludan, K.
Expression of monocyte chemotactic and activating factor (MCAF) in skin related cells. A
comparative study. Cytokine 5, 520-524 (1993).

17. Huang, S., Xie, K., Singh, R.K., Gutman, M. & Bar-Eli, M. Suppression of tumor growth and
metastasis of murine renal adenocarcinoma by syngeneic fibroblasts genetically engineered to
secrete the JE/MCP-1 cytokine. J. Interferon Cytokine Res. 15, 655-665 (1995).

18. El-Deiry, W.S. et al. WAF1, a potential mediator of p53 tumor suppression. Cell 75, 817-825
(1993).

19. Miele, M.E. et al. Metastasis suppressed, but tumorigenicity and local invasiveness
unaffected, in the human melanoma cell line MelJuSo after introduction of human chromosomes 1 or
6. Mol. Carcinog. 15, 264-299 (1996).

20. Jiang, H. et al. The melanoma differentiation-associated gene mda-6, which encodes the
cyclin-dependent kinase inhibitor p21, is differentially expressed during growth, differentiation
and progression in human melanoma cells. Oncogene 10, 1855-1864 (1995).

21. Schuler, G.D. et al. A gene map of the human genome. Science 274, 540-546 (1996).

22. Lennon, G., Auffray, C., Polymeropoulos, M. & Soares, M.B. The I.M.A.G.E. Consortium: an
integrated molecular analysis of genomes and their expression. Genomics 33, 151-152 (1996).



GENOME RESEARCH

Volume 6 Number 7
July 1996

Cold Spring Harbor Laboratory Press

Cover listing: IBD Mapping in Livestock; Sequence of 500-kb Rhizobium Replicon; Human Y
Chromosome Haplotypes; BAC Mapping of Extrachromosomal Structure; DNA Microarray System


GENOME METHODS

A DNA Microarray System for Analyzing
Complex DNA Samples Using Two-color
Fluorescent Probe Hybridization

Dari Shalon,1,4 Stephen J. Smith,3 and Patrick O. Brown1,2,5

1Howard Hughes Medical Institute and Departments of 2Biochemistry and 3Molecular and Cellular
Physiology, Stanford University, Stanford, California 94305

Detecting and determining the relative abundance of diverse individual sequences in complex DNA samples is 
a recurring experimental challenge in analyzing genomes. We describe a general experimental approach to 
this problem, using microscopic arrays of DNA fragments on glass substrates for differential hybridization 
analysis of fluorescently labeled DNA samples. To test the system, 864 physically mapped λ clones of yeast
genomic DNA, together representing >75% of the yeast genome, were arranged into 1.8-cm × 1.8-cm arrays,
each containing a total of 1744 elements. The microarrays were characterized by simultaneous hybridization 
of two different sets of isolated yeast chromosomes labeled with two different fluorophores. A laser 
fluorescent scanner was used to detect the hybridization signals from the two fluorophores. The results 
demonstrate the utility of DNA microarrays in the analysis of complex DNA samples. This system should 
find numerous applications in genome-wide genetic mapping, physical mapping, and gene expression studies. 



Many problems in genome analysis depend on 
determining what specific sequences are repre- 
sented in a complex DNA or RNA sample and at 
what abundance, for example, what genes are 
represented in a specific chromosome band or 
YAC clone, what intervals are amplified or de- 
leted in a particular cancer cell, or what genes are 
expressed in specific cells under specific condi- 
tions. As a general approach to this problem, we 
have developed a system for making microarrays 
of DNA samples on glass substrates, probing 
them by hybridization with complex fluorescent-labeled
probes, and using a laser-scanning microscope to detect
the fluorescent signals representing hybridization.
Fluorescent labeling allows for
simultaneous hybridization and separate detec- 
tion of the hybridization signal from two or more 
probes. This in turn allows very accurate and re- 
liable measurement of the relative abundance of 
specific sequences in two complex samples. 

RESULTS 

Array Hybridization Pattern 

Figure 1 shows the two-color fluorescent scan of 
a yeast genomic array following hybridization 

4Present address: Synteni Inc., Palo Alto, California 94303.
5Corresponding author.

E-MAIL pbrown@cmgm.stanford.edu, http://cmgm.stanford.edu/pbrown; FAX (415) 725-1399.



with a mixed probe consisting of lissamine-labeled DNA from the 6 largest yeast
chromosomes together with fluorescein-labeled DNA from the 10 smallest yeast
chromosomes. A red color indicates that yeast sequences present in the
lissamine-labeled hybridization probe hybridized to an array element. A
yellow-green color indicates that yeast sequences present in the
fluorescein-labeled hybridization probe hybridized to an array element. An
orange color indicates cross-hybridization of both chromosome pools to an array
element (e.g., dispersed repetitive elements, such as Ty1 elements).

Each clone was spotted twice, resulting in du- 
plicate hybridization patterns in adjacent quad- 
rants of the array. Control DNA spots, which 
were randomly amplified in the same manner as 
the λ clone array elements, are located in the bot-
tom corner of each quadrant. "A" points to a pair
of spots containing total yeast genomic DNA. 
These spots appear orange because both chromo- 
some pools hybridized to yeast genomic DNA. 
The negative controls are as follows: "B" points 
to a pair of spots of wild-type λ DNA, "C" points
to a pair of human genomic DNA spots, and "D"
points to a pair of φX174 DNA spots. The lack of
a hybridization signal at these three negative 
control spots indicates that the hybridization was 
specific for yeast sequences.

Figure 1 Two-color fluorescent scan of a 1.8-cm x 1.8-cm yeast array
of λ clones of yeast genomic DNA. The DNA spots are spaced at a
distance of 380 µm from center to center. A probe mixture consisting of
DNA from the 6 largest yeast chromosomes (4, 7, 12, 13, 15, 16) labeled
with lissamine (red dots) and DNA from the 10 smallest yeast chromo-
somes (1, 2, 3, 5, 6, 8, 9, 10, 11, 14) labeled with fluorescein (yellow-
green dots) was hybridized to the array. A pair of yeast genomic DNA
spots (A) served as a positive control. The three negative controls are λ
DNA (B), human genomic DNA (C), and φX174 DNA (D).



Karyotype Depiction of the Array Hybridization 
Pattern 



The inserts contained in the arrayed λ clones
have been mapped physically (Riles et al. 1993). 
The clones are arrayed in a random but known 
order on the array. Therefore, using the identity 
of each clone along with its physical map infor- 
mation, the pattern of hybridization to the yeast 
array can be represented in the form of a karyo- 
type of the yeast genome, as shown in Figure 2. 
The color of any segment of the ideogram repre- 
senting an individual chromosome on the karyo- 
type is directly determined by the ratio of red and 
green hybridization signals at the array positions 
of the corresponding clones. The lengths of the 
discrete colored segments of each chromosome 
correspond to the physical lengths of the yeast 



inserts. The chromosome seg- 
ments colored black represent ei- 
ther intervals of the genome that 
are not represented by clones in 
the library (90%) or false-negative
hybridization signals on the array
(10%). Most of these false nega-
tives are attributable to failures of
the PCR amplification of the λ
clones, though occasional failures 
of the arraying process or nonuni- 
form surface preparation could ac- 
count for a small fraction of the 
false-negative signals. The large 
gap on chromosome 12 is the re- 
gion coding for ribosomal DNA 
that was not represented among 
the arrayed clones. Genomic inter-
vals represented by overlapping
clones were assigned a color based 
on the hybridization signals of 
only one of the overlapping 
clones, chosen at random. 
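
The color assignment described above can be sketched in a few lines of Python. This is only an illustration, not the authors' software: the function name, the detection threshold, and the twofold ratio cutoff are assumptions chosen for the example.

```python
def segment_color(red, green, detect_threshold=50.0, ratio_cutoff=2.0):
    """Pick an ideogram color for one clone from its two channel intensities.

    detect_threshold and ratio_cutoff are hypothetical values, not taken
    from the paper.
    """
    # No usable signal in either channel: the segment is drawn black.
    if max(red, green) < detect_threshold:
        return "black"
    # A tiny offset avoids division by zero when one channel is silent.
    ratio = (red + 1e-9) / (green + 1e-9)
    if ratio >= ratio_cutoff:
        return "red"        # mostly lissamine-labeled probe
    if ratio <= 1.0 / ratio_cutoff:
        return "green"      # mostly fluorescein-labeled probe
    return "orange"         # comparable signal from both probes


# Made-up intensities for four clones.
colors = [segment_color(r, g) for r, g in [(1000, 100), (100, 1000), (500, 450), (10, 5)]]
```

With these made-up intensities the four clones are classified as red, green, orange, and black, in that order.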

Note that in this representa- 
tion of a yeast karyotype, the larg- 
est six chromosomes are mainly 
colored red. This indicates that
most of the arrayed clones that 
were mapped previously to these 
six large chromosomes hybridized 
primarily to the lissamine-labeled 
probe prepared from the corre- 
sponding purified chromosomes. 
Conversely, the smallest 10 chro- 
mosomes are mainly colored green 
in this image, matching the origi- 
nal CHEF gel isolation of the chro- 
mosomes used as the hybridization probe. The 
experiment was repeated with the yeast genome 
split into six discrete chromosome pools contain- 
ing 2-4 chromosomes per pool using CHEF gel 
electrophoresis. The chromosomes in each pool 
were extracted from the gel, amplified, and fluo- 
rescently labeled. The six chromosome pools 
were hybridized to six separate yeast arrays. 
Forty-four λ clones gave a positive hybridization
signal on all six arrays indicating that they con- 
tain yeast repetitive sequences (data not shown). 
These 44 clones and 10 clones with very weak 
hybridization signals were not included in the 
data set used to produce this karyotype. 

Figure 2 Computer-generated ideogram repre-
senting a karyotype of S. cerevisiae, based on the
normalized hybridization signals from the array
shown in Fig. 1. Note that the 6 largest chromo-
somes are mainly red and the 10 smallest chromo-
somes are mainly green. Black stripes represent in-
tervals not represented by clones in the array or for
which the corresponding clones gave false-negative
hybridization signals.

There were ~40 anomalous clones, which ap-
pear in this karyotype representation as green
bands on the otherwise red chromosomes or red
bands on the otherwise green chromosomes.
Four randomly chosen examples of these anoma- 
lous clones were analyzed by hybridizing the 
clones to vertical strips cut from a Southern blot 
of CHEF gel-separated yeast chromosomes. In 
each case, the hybridization patterns of the 
anomalous clones corroborated the chromo- 
somal locations assigned by the microarray hy- 
bridization results (data not shown). Two clones 
that were thought to map to the 10 smallest chro- 
mosomes were found to hybridize preferentially 
to the probe representing the 6 largest chromo- 
somes and thus appear as anomalous red bands 
on the karyotype. Both hybridized to one of the 
six largest chromosomes on the Southern blot. 
Similarly, two clones that appear as anomalous 
green bands on the karyotype were found to hy-
bridize to one of the 10 smallest chromosomes on
the Southern blot. Thus, the anomalous clones 
are probably the result of sample tracking errors 
or, possibly, of errors in the published restriction- 
digest-based physical map on which the karyo- 
type representation was based (Riles et al. 1993). 

DISCUSSION 

The DNA microarray hybridization system re-
ported here is conceptually and functionally
similar to fluorescent in situ hybridization (FISH)
to metaphase chromosomes, with three impor- 
tant differences. First, the target elements of the 
microarrays can, in principle, be any length or
composition, from megabase YAC clones or mi-
crodissected chromosome bands to individual
cDNA clones, to short oligonucleotides. This ver-
satility allows the user to choose characteristics,
such as the mapping resolution and genetic com-
plexity of each array element, to suit a particular
application. Second, the hybridization signals are
localized to discrete elements of known size and
location, making them easier to identify and 
quantitate than the hybridization signals from 
irregularly shaped metaphase spreads. Third, mi- 
croarrays are more consistent and potentially 
amenable to automated production, hybridiza- 
tion, and data analysis than metaphase spreads. 

Arrays of DNA samples on porous mem-
branes, for example, dot blots, have long been
used as a basic tool in molecular biology. Dot- 
blot membranes are usually at least 8 x 12 cm in 
size, require the use of milliliter volumes of hy- 
bridization solution, and are limited, owing to 
autofluorescence and scattering, to radioactive, 
chemiluminescent, and colorimetric hybridiza-
tion detection methods (Ross et al. 1992). Micro- 
arrays made on glass surfaces, on the other hand, 
can be mass-produced and are comparatively in- 
expensive, convenient, and compatible with 
fluorescent hybridization detection methods. 
Furthermore, a glass surface, when appropriately 
treated, has very low nonspecific binding of la-
beled hybridization probes, resulting in lower
backgrounds than are encountered typically with 
porous membranes. For hybridizations with very 
complex probes, the concentration of the labeled 
probe DNA is a limiting factor in the sensitivity 
of the assay. Minimizing the volume of the probe 
solution in a hybridization, by restricting the tar- 
get to a small area and by using a nonporous 
substrate, makes it practical to achieve very high 
probe concentrations. 

One important advantage of fluorescently la- 
beled probes is that, unlike most radioactive and 
chemiluminescent signals, fluorescent signals do
not disperse and therefore allow for very dense 
array spacing. A unique, and probably the most 
important, advantage of fluorescent probes is 
that the hybridization signals from two or more 
differently labeled probes hybridized to the same 
target element can be detected separately. In this 
way, two-color hybridization detection allows for 
a direct and quantitative comparison of the
abundance of specific sequences between two
probe mixtures that are hybridized competitively
to a single array. The absolute intensity of a hy-
bridization signal at a particular element in an
array can vary owing to experimental factors 
such as variations in the amount of DNA depos- 
ited on the array, variations in the hybridization 
or wash conditions between experiments, or 
variations in the hybridization characteristics of 
the different DNA sequences on the array. The 
ratio of the two signals at any element in an ar- 
ray, however, is relatively insensitive to these 
confounding factors because they affect both 
probe mixtures equivalently. This ratio therefore
accurately reflects the relative abundance of the 
cognate sequence in the two probe samples. This 
is the principle underlying the technique of com-
parative genomic hybridization (CGH), which is
used to detect changes in the copy number of 
specific chromosomes or chromosomal regions 
(Kallioniemi et al. 1992). CGH is based on mea- 
suring the relative fluorescent hybridization in- 
tensities of two genomic-complexity hybridiza- 
tion probes, for example, probes representing ge- 
nomic DNA from normal and affected tissue 
samples, which are labeled with two distinct fluo- 
rophores and hybridized simultaneously to a 
metaphase spread. DNA microarray representa- 
tions of the human genome may provide a more 
convenient and higher resolution alternative to 
metaphase chromosomes for CGH. 
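
A minimal numerical sketch of why the two-color ratio is robust, assuming a simple multiplicative model in which the amount of DNA on a spot and the hybridization efficiency scale both channels equally. The model, the function names, and all numbers are illustrative assumptions, not measurements from this study.

```python
def channel_signals(dna_amount, hyb_efficiency, abundance_a, abundance_b):
    """Return (signal_a, signal_b) for one array element.

    Assumed model: both probes see the same spot, so the same
    spot-specific factors multiply both channels.
    """
    scale = dna_amount * hyb_efficiency
    return scale * abundance_a, scale * abundance_b


def abundance_ratio(signal_a, signal_b):
    """Ratio of the two channel signals at one element."""
    return signal_a / signal_b


# Same relative abundance (3:1) measured on a well-printed spot
# and on a poorly printed, poorly hybridizing spot.
good_spot = channel_signals(1.0, 0.9, 30.0, 10.0)
poor_spot = channel_signals(0.2, 0.5, 30.0, 10.0)

ratio_good = abundance_ratio(*good_spot)
ratio_poor = abundance_ratio(*poor_spot)
```

The absolute signals differ about ninefold between the two spots, while both ratios come out at about 3, which is the behavior the text attributes to competitive two-color hybridization.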

Cross-hybridization between related se- 
quences is an important problem faced by any 
hybridization-based assay, including the DNA 
microarray assay described here. Studies are now 
in progress to quantitate the extent of cross- 
hybridization between related sequences of vary-
ing homology and length, in DNA microarray
hybridizations. The stringency of hybridization 
and washing can be controlled by varying the salt 
concentration and temperature as in conven- 
tional membrane-based hybridizations. Cross- 
hybridization caused by repetitive sequences can 
be minimized by prehybridization of the probe or 
array with vast excess of unlabeled copies of the 
repetitive sequences. 

Alternative methods have been described for 
making microarrays of very short DNA se- 
quences, involving photolithography (Pease et 
al. 1994) or physical masking (Maskos and South- 
ern 1992) methods. These in situ synthesis meth- 
ods are inherently limited to low complexity ar- 
ray elements consisting of oligonucleotides. For 
complex-probe hybridizations, the specificity of 



hybridization is improved by using DNA frag-
ments substantially longer than oligonucleo-
tides. Moreover, the in situ synthesis approaches
to array fabrication depend on prior knowledge
of the sequence to be recognized by each array
element. The approach described here makes mi-
croarrays by transferring tiny volumes of DNA
samples from microwell storage plates to a solid 
substrate. Thus, nucleic acids (or other mol- 
ecules) of virtually any length or any origin can 
be arrayed, and knowledge of their sequences is 
not required. 

The arrays used in these experiments do not 
represent the maximal achievable density of ele- 
ments. We have found that the spacing between 
the spots can be decreased by shrinking the con- 
tact area of the printing tip and by increasing the 
hydrophobicity of the glass surface. Microarrays 
with 100-µm feature size have been tested suc-
cessfully in pilot experiments (data not shown).
Assuming the projected availability of the appro-
priate physically mapped human genomic clones
(Hudson et al. 1995), arrays at 100-µm spacing
would allow for 10,000 discrete intervals of the
human genome to be represented in a 1-cm² ar-
ray. Such an array could be used for mapping at a
resolution of <0.5 Mb. Experiments are in 
progress to explore the feasibility of such arrays. 
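
The density and resolution figures above follow from simple arithmetic, sketched here in Python. The spacing and array size come from the text; the haploid human genome size of roughly 3,000 Mb and the idea of evenly distributed, non-overlapping clones are assumptions made for the estimate.

```python
SPACING_UM = 100          # center-to-center spacing from the text
ARRAY_SIDE_UM = 10_000    # a 1-cm x 1-cm array, in micrometers
GENOME_MB = 3000.0        # assumed approximate haploid human genome size

# Elements along one side, then in the whole square array.
elements_per_side = ARRAY_SIDE_UM // SPACING_UM
total_elements = elements_per_side ** 2

# Average genomic interval per element if clones tile the genome evenly.
mb_per_element = GENOME_MB / total_elements
```

This gives 100 elements per side, 10,000 elements per array, and about 0.3 Mb per element, consistent with the stated resolution of better than 0.5 Mb.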

Our initial motivation for developing these 
microarrays arose from the need for abundant 
and inexpensive genomic arrays for genomic 
mismatch scanning (GMS) (Nelson et al. 1993), a 
method of genetic linkage analysis based on 
identification of the regions of "identity by de- 
scent" between affected relative pairs using a 
single complex-probe hybridization to an array 
of genomic clones. Experiments using these ar- 
rays to map quantitative trait loci in yeast by 
GMS are currently in progress (J. deRisi, D. Lash-
kari, L. Penland, L. McAllister, J. McCusker, R.
Davis, and P.O. Brown, unpubl.). 

Microarrays of cDNA clones, prepared using 
the system described here, have been used for 
quantitative monitoring of gene expression pat- 
terns in Arabidopsis (Schena et al. 1995), S. cerevi- 
siae (D. Lashkari, J. deRisi, L. Penland, P.O. 
Brown, and R. Davis, unpubl.), and human tis-
sues (J. deRisi, M. Bittner, P. Meltzer, L. Penland,
J. Trent, and P.O. Brown, unpubl.). We an-
ticipate that DNA microarrays of the kind de- 
scribed here will be useful in additional applica- 
tions for which conventional dot blots, high- 
density gridded arrays on porous membranes, or 
FISH are currently used. These potential applica-
tions include comparative genomic hybridiza-
tion (Kallioniemi et al. 1992), sequencing by hy-
bridization (Drmanac et al. 1993), physical map-
ping of cloned or amplified sequences (Billings et
al. 1991), and economical distribution of re-
agents for integrated genetic and physical map- 
ping based on a common set of arrayed clones 
(Zehetner and Lehrach 1994). 

METHODS 

Amplification of Target DNA Elements 

The array elements were prepared from physically mapped
λ clones (Riles et al. 1993). The λ clones were amplified
using randomly primed polymerase chain reaction (PCR)
based on published and unpublished protocols (Bohlander
et al. 1992; S. Nelson, unpubl.). The phage lysates were
amplified in a 10-µl PCR reaction using 5 µM final concen-
tration of primer A (GCTATCTTCAAGATCANNNNNN),
200 µM dNTPs, and 1 unit of Taq polymerase. Round A
consisted of five cycles at 94°C for 1 min, 25°C for 1.5 min,
25-72°C over 7 min, and 72°C for 3 min using Taq poly-
merase (BMB). For round B, the reaction volume was
brought up to 100 µl for a final concentration of 2 µM of
primer B (GCTATCTTCAAGATCA), 200 µM dNTPs, and 4
units of Taq polymerase. Round B consisted of 30 cycles of
94°C for 1 min, 56°C for 2 min, and 72°C for 3 min. The
amplification was performed in 96-well plates using crude
phage lysates as the templates, resulting in an amplifica-
tion of both the 35-kb λ vector and the 5-kb to 15-kb yeast
insert sequences as a distribution of PCR products between
250 bp and 1500 bp in length.

The PCR products were purified and transferred into
TE (10 mM Tris, 1 mM EDTA at pH 8.0) buffer using Sepha-
dex G50 gel filtration (Pharmacia) and evaporated to dry-
ness at room temperature overnight. Each of the 864 am-
plified λ clones was rehydrated in 15 µl of 3x SSC
(20x SSC = 3 M NaCl, 0.3 M Na3 citrate) in preparation for
spotting onto the glass under normal room temperature
conditions.

Figure 3 The layout of the arraying machine. All motions are under computer
control. For more details of the arraying machine, see web page
http://cmgm.stanford.edu/pbrown.



Preparation of DNA Microarrays 

The microarrays were fabricated on poly-L-lysine coated
microscope slides (Sigma). A custom-built arraying ma-
chine, consisting of four tweezer-like printing tips
mounted 9 mm apart on a computer-controlled robotic
stage (Shalon 1996), loaded 1 µl of the concentrated PCR
product directly from corresponding clusters of four wells
of 96-well storage plates and deposited ~5 nl of each
sample onto each of 40 slides. Surface tension loaded the
sample into the printing tip directly from the microwell
plate and held the sample in the tip during the printing
operation. Printing was achieved by lightly tapping the tip
against the glass surface. The open-capillary design al-
lowed for rapid rinsing and drying of the tips between
samples. Figure 3 shows the layout of the arraying ma-
chine. Figure 4 shows a detailed view of the four printing
tips and the staggered printing pattern on the microscope
slides. Adjacent samples were spotted 380 µm apart on the
slides. After each set of four samples was printed onto 40
slides, the printing tips were rinsed with a jet of water for
2 sec and then dried by lowering the tips onto a sponge for
2 sec. The process was repeated for all 864 samples and
eight control spots.
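
As a rough bookkeeping check on the printing run, the following Python sketch counts load cycles and spots per slide and compares the spot count with the capacity of a 1.8-cm square at 380-µm spacing. The sample, tip, and spacing numbers come from the text; treating the layout as a simple square grid, and counting one load cycle per set of four samples in a single pass, are simplifying assumptions.

```python
SAMPLES = 864             # amplified clones
CONTROL_SPOTS = 8         # four controls, each spotted as a pair
REPLICATES = 2            # each clone was spotted twice
TIPS = 4                  # printing tips loaded together
SPACING_UM = 380          # center-to-center spot spacing
ARRAY_SIDE_UM = 18_000    # 1.8-cm array side

# One load cycle prints four samples per pass over the plates.
load_cycles_per_pass = SAMPLES // TIPS

# Spots that must fit on every slide.
spots_per_slide = SAMPLES * REPLICATES + CONTROL_SPOTS

# Capacity of an idealized square grid at the stated spacing.
positions_per_side = ARRAY_SIDE_UM // SPACING_UM
grid_capacity = positions_per_side ** 2
```

Under these assumptions a pass takes 216 load cycles, each slide carries 1,736 spots, and a 47 x 47 grid offers 2,209 positions, so the duplicated clone set fits within the stated array dimensions.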

After the spotting operation was complete, the slides
were rehydrated in a humid chamber at room temperature
for 2 hr, baked in an 80°C vacuum oven for 2 hr, then
rinsed in 0.1% sodium dodecyl sulfate (SDS) to remove
unadsorbed DNA. To reduce nonspecific adsorption of the
labeled hybridization probe to the poly-L-lysine coated
glass surface, the slides were treated with succinic anhy-
dride. One gram of succinic anhydride was dissolved in
100 ml of 1-methyl-2-pyrrolidinone and then 100 ml of
0.2 M boric acid (pH 8.0) was
added. The arrays were soaked in
this solution for 10 min and then
rinsed in distilled water four
times for 5 min each. Immedi-
ately before use, the arrayed DNA
elements were denatured by plac-
ing the slide in distilled water at
90°C for 2 min.



Amplification and Labeling 
of Hybridization Probe 

The 16 chromosomes of Saccharo-
myces cerevisiae were separated us-
ing a contour-clamped homoge-
neous electric field (CHEF) aga-
rose gel apparatus (Bio-Rad) (Chu
et al. 1986). The 6 largest chromo-
somes were isolated in one gel
slice and the smallest ten chro-
mosomes in a second gel slice.
The DNA from each slice was re-
covered using a gel extraction kit
(Qiagen) and randomly amplified in a manner similar to
that used in amplifying the target λ clones (Grothues et al.
1993). The main difference between this amplification
procedure and the one used for the λ array elements is a
filtration step between rounds A and B to remove primer-
dimers and the use of a random 9-mer 3' end on primer A.
Following amplification, 2.5 µg of each of the amplified
chromosome pools were separately random-primer labeled
using Klenow polymerase (Amersham) with a lissamine-
conjugated nucleotide analog (DuPont NEN) for the pool
containing the 6 largest chromosomes and with a fluores-
cein-conjugated nucleotide analog (BMB) for the pool con-
taining the smallest 10 chromosomes. The two fluores-
cent-labeled pools were mixed and concentrated using an
ultrafiltration device (Amicon).

Figure 4 A close-up view of the four open-
capillary printing tips. The tips are 9 mm apart and
fit into four adjacent wells of a standard microwell
plate and print arrays in a staggered fashion on mi-
croscope slides. For more details of the printing tips,
see web page http://cmgm.stanford.edu/pbrown.



Hybridization 

Five micrograms of the hybridization probe, consisting of
both chromosome pools in 7.5 µl of TE, was denatured in
a boiling water bath and then snap-cooled on ice. Concen-
trated hybridization solution (2.5 µl) was added to a final
concentration of 5x SSC/0.1% SDS. The entire 10 µl of
probe solution was transferred to the array surface, covered
with a coverslip, placed in a custom-built single-slide hu-
midity chamber, and incubated in a 60°C water bath for 12
hr. The custom-built waterproof slide chamber has a cavity
just slightly bigger than a microscope slide and was kept at
100% humidity internally by the addition of 2 µl of water
in a corner of the chamber. The slide was rinsed in 5x
SSC/0.1% SDS for 5 min and then in 0.2x SSC/0.1% SDS
for 5 min. All rinses were at room temperature. The array
was then air dried, and a drop of antifade (Molecular
Probes) was applied to the array under a 24-mm x 30-mm
coverslip in preparation for scanning.



Detection and Analysis 

A custom-built laser scanner was used to detect the two-
color fluorescence hybridization signals from 1.8-
cm x 1.8-cm arrays at 20-µm resolution. The glass sub-
strate slide was mounted on a computer-controlled, two-
axis translation stage (PM-500, Newport, Irvine, CA) that
scanned the array over an upward-facing microscope ob-
jective (20x, 0.75NA Fluor, Nikon, Melville, NY) in a bi-
directional raster pattern. A water-cooled Argon/Krypton
laser (Innova 70 Spectrum, Coherent, Palo Alto, CA), op-
erated in multiline mode, allowed for simultaneous speci-
men illumination at 488.0 nm and 568.2 nm. These two
lines were isolated by a 488/568 dual-band excitation filter
(Chroma Technology, Brattleboro, VT). An epifluores-
cence configuration with a dual-band 488/568 primary
beam splitter (Chroma) excited both fluorophores simul-
taneously and directed fluorescence emissions toward the
two-channel detector. Emissions were split by a secondary
dichroic mirror with a 565-nm transition wavelength onto two
multialkali cathode photomultiplier tubes (PMT; R928,
Hamamatsu, Bridgewater, NJ), one with an HQ535/50
bandpass barrier filter and the other with a D630/60 band-
pass barrier filter (Chroma). Preamplified PMT signals were
read into a personal computer using a 12-bit analog-to-
digital conversion board (RTI-834, Analog Devices, Nor-
wood, MA), displayed in a graphics window, and stored to
disk for further rendering and analysis. The back aperture
of the 20x objective was deliberately underfilled by the
illuminating laser beam to produce a large-diameter illu-
minating spot at the specimen (5-µm to 10-µm half-
width). Stage scanning velocity was 100 mm/sec, and PMT
signals were digitized at 100-µsec intervals. Two successive
readings were summed for each pixel, such that pixel spac-
ing in the final image was 20 µm. Beam power at the
specimen was ~5 mW for each of the two lines.
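
The pixel spacing quoted above can be checked from the stage velocity and the digitization interval. The short Python sketch below does that arithmetic; the variable names are illustrative, and the pixels-per-line figure assumes a scan line exactly as long as the 1.8-cm array side.

```python
STAGE_VELOCITY_UM_PER_S = 100_000   # 100 mm/sec expressed in micrometers
SAMPLE_INTERVAL_US = 100            # PMT digitization interval
READINGS_PER_PIXEL = 2              # two successive readings are summed
ARRAY_SIDE_UM = 18_000              # 1.8-cm array side
ADC_BITS = 12                       # analog-to-digital converter resolution

# Distance the stage travels between two successive readings.
um_per_reading = STAGE_VELOCITY_UM_PER_S * SAMPLE_INTERVAL_US / 1_000_000

# Summing two readings doubles the distance covered by one pixel.
pixel_spacing_um = um_per_reading * READINGS_PER_PIXEL

# Pixels along one scan line, assuming the line spans the array side.
pixels_per_line = ARRAY_SIDE_UM / pixel_spacing_um

# Number of distinct levels a single 12-bit reading can take.
adc_levels = 2 ** ADC_BITS
```

The stage moves 10 µm per reading, so two summed readings give the stated 20-µm pixel, about 900 pixels per scan line, with 4,096 levels per individual reading.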

The scanned image was despeckled using a graphics 
program (Hijaak Graphics Suite) and then analyzed using 
a custom image gridding program that created a spread- 
sheet of the average red and green hybridization intensi- 
ties for each spot. The red and green hybridization inten- 
sities were corrected for optical cross talk between the fluo- 
rescein and lissamine channels, using experimentally 
determined coefficients. 
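
One common way to apply such a correction is to treat the cross talk as a linear mixing of the two channels and invert it. The Python sketch below illustrates that idea only; the linear model, the function name, and the coefficient values are assumptions, since the text does not give the form of the correction or the measured coefficients.

```python
def correct_cross_talk(measured_red, measured_green, green_into_red, red_into_green):
    """Recover true channel intensities from measured ones.

    Assumed linear model (not stated in the paper):
        measured_red   = true_red   + green_into_red * true_green
        measured_green = true_green + red_into_green * true_red
    """
    det = 1.0 - green_into_red * red_into_green
    if det == 0:
        raise ValueError("coefficients make the two channels indistinguishable")
    true_red = (measured_red - green_into_red * measured_green) / det
    true_green = (measured_green - red_into_green * measured_red) / det
    return true_red, true_green


# Simulate one spot with known true signals, then undo the mixing.
true_r, true_g = 100.0, 50.0
a, b = 0.05, 0.10                    # hypothetical cross-talk coefficients
meas_r = true_r + a * true_g         # 102.5
meas_g = true_g + b * true_r         # 60.0
rec_r, rec_g = correct_cross_talk(meas_r, meas_g, a, b)
```

In this simulated case the corrected values return to the true intensities of 100 and 50 within floating-point error.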
ABSTRACT cDNA microarray technology is used to profile
complex diseases and discover novel disease-related genes. In
inflammatory disease such as rheumatoid arthritis, expression
patterns of diverse cell types contribute to the pathology. We
have monitored gene expression in this disease state with a
microarray of selected human genes of probable significance in
inflammation as well as with genes expressed in peripheral
human blood cells. Messenger RNA from cultured macrophages,
chondrocyte cell lines, primary chondrocytes, and synoviocytes
provided expression profiles for the selected cytokines, chemo-
kines, DNA binding proteins, and matrix-degrading metal-
loproteinases. Comparisons between tissue samples of rheuma-
toid arthritis and inflammatory bowel disease verified the in-
volvement of many genes and revealed novel participation of the
cytokine interleukin 3, chemokine Groα and the metal-
loproteinase matrix metallo-elastase in both diseases. From the
peripheral blood library, tissue inhibitor of metalloproteinase 1, 
ferritin light chain, and manganese superoxide dismutase genes 
were identified as expressed differentially in rheumatoid arthri- 
tis compared with inflammatory bowel disease. These results 
successfully demonstrate the use of the cDNA microarray system 
as a general approach for dissecting human diseases. 



The recently described cDNA microarray or DNA-chip tech- 
nology allows expression monitoring of hundreds and thou- 
sands of genes simultaneously and provides a format for 
identifying genes as well as changes in their activity (1, 2). 
Using this technology, two-color fluorescence patterns of 
differential gene expression in the root versus the shoot tissue 
of Arabidopsis were obtained in a specific array of 48 genes (1).
In another study using a 1000 gene array from a human
peripheral blood library, novel genes expressed by T cells were 
identified upon heat shock and protein kinase C activation (3). 

The technology uses cDNA sequences or cDNA inserts of a 
library for PCR amplification that are arrayed on a glass slide with 
high speed robotics at a density of 1000 cDNA sequences per cm².
These microarrays serve as gene targets for hybridization to 
cDNA probes prepared from RNA samples of cells or tissues. A 
two-color fluorescence labeling technique is used in the prepa- 
ration of the cDNA probes such that a simultaneous hybridization 
but separate detection of signals provides the comparative anal- 
ysis and the relative abundance of specific genes expressed (1,2). 
Microarrays can be constructed from specific cDNA clones of 
interest, a cDNA library, or a select number of open reading 
frames from a genome sequencing database to allow a large-scale 
functional analysis of expressed sequences.

Because of the wide spectrum of genes and endogenous
mediators involved, the microarray technology is well suited
for analyzing chronic diseases. In rheumatoid arthritis (RA),
inflammation of the joint is caused by the gene products of 
many different cell types present in the synovium and cartilage 
tissues plus those infiltrating from the circulating blood. The 
autoimmune and inflammatory nature of the disease is a 
cumulative result of genetic susceptibility factors and multiple 
responses, paracrine and autocrine in nature, from macro- 
phages, T cells, plasma cells, neutrophils, synovial fibroblasts, 
chondrocytes, etc. Growth factors, inflammatory cytokines 
(4), and the chemokines (5) are the important mediators of this 
inflammatory process. The ensuing destruction of the cartilage 
and bone by the invading synovial tissue includes the actions 
of prostaglandins and leukotrienes (6), and the matrix degrad- 
ing metalloproteinases (MMPs). The MMPs are an important 
class of Zn-dependent metallo-endoproteinases that can col-
lectively degrade the proteoglycan and collagen components of
the connective tissue matrix (7). 

This paper presents a study in which the involvement of 
select classes of molecules in RA was examined. Also inves- 
tigated were 1000 human genes randomly selected from a 
peripheral human blood cell library. Their differential and 
quantitative expression analysis in cells of the joint tissue, in 
diseased RA tissue and in inflammatory bowel disease (IBD) 
tissues was conducted to demonstrate the utility of the mi-
croarray method to analyze complex diseases by their pattern
of gene expression. Such a survey provides insight not only into 
the underlying cause of the pathology, but also provides the 
opportunity to selectively target genes for disease intervention 
by appropriate drug development and gene therapies. 

METHODS 

Microarray Design, Development, and Preparation. Two ap- 
proaches for the fabrication of cDNA microarrays were used in 
this study. In the first approach, known human genes of probable 
significance in RA were identified. Regions of the clones, pref- 
erably 1 kb in length, were selected by their proximity to the 3' end 
of the cDNA and for areas of least identity to related and 
repetitive sequences. Primers were synthesized to amplify the 
target regions by standard PCR protocols (3). Products were
verified by gel electrophoresis and purified with Qiaquick 96-well
purification kit (Qiagen, Chatsworth, CA), lyophilized (Savant),
and resuspended in 5 µl of 3x standard saline citrate (SSC) buffer
for arraying. In the second approach, the microarray containing
the 1056 human genes from the peripheral blood lymphocyte
library was prepared as described (3).

Abbreviations: RA, rheumatoid arthritis; MMP, matrix-degrading
metalloproteinase; IBD, inflammatory bowel disease; LPS, lipopoly-
saccharide; PMA, phorbol 12-myristate 13-acetate; TNF-α, tumor
necrosis factor α; IL, interleukin; TGF-β, transforming growth factor
β; GCSF, granulocyte colony-stimulating factor; MIP, macrophage
inflammatory protein; MIF, migration inhibitory factor; HME, human
matrix metallo-elastase; RANTES, regulated upon activation, normal
T cell expressed and secreted; Gel, gelatinase; VCAM, vascular cell
adhesion molecule; ICE, IL-1 converting enzyme; PUMP, putative
metalloproteinase; MnSOD, manganese superoxide dismutase; TIMP,
tissue inhibitor of metalloproteinase; MCP, macrophage chemotactic
protein.

Tissue Specimens. Rheumatoid synovial tissue was obtained 
from patients with late stage classic RA undergoing remedial 
synovectomy or arthroplasty of the knee. Synovial tissue was 
separated from any associated connective tissue or fat. One 
gram of each synovial specimen was subjected to RNA extrac- 
tion within 40 min of surgical excision, or explants were 
cultured in serum-free medium to examine any changes under 
in vitro conditions. For IBD, specimens of macroscopically
inflamed lower intestinal mucosa were obtained from patients
with Crohn disease undergoing remedial surgery. The hyper-
trophied mucosal tissue was separated from underlying con-
nective tissue and extracted for RNA. 

Cultured Cells. The Mono Mac-6 (MM6) monocytic cells
(8) were grown in RPMI medium. Human chondrosarcoma
SW1353 cells, primary human chondrocytes, and synoviocytes
(9, 10) were cultured in DMEM; all culture media were
supplemented with 10% fetal bovine serum, 100 µg/ml strep-
tomycin, and 500 units/ml penicillin. Treatment of cells with
lipopolysaccharide (LPS) endotoxin at 30 ng/ml, phorbol
12-myristate 13-acetate (PMA) at 50 ng/ml, tumor necrosis
factor α (TNF-α) at 50 ng/ml, interleukin (IL)-1β at 30 ng/ml,
or transforming growth factor-β (TGF-β) at 100 ng/ml is
described in the figure legends.



Fluorescent Probe, Hybridization, and Scanning. Isolation of
mRNA, probe preparation, and quantitation with Arabidopsis
control mRNAs was essentially as described (3) except for the
following minor modification. Following the reverse transcriptase
step, the appropriate Cy3- and Cy5-labeled samples were pooled;
mRNA was degraded by heating the sample to 65°C for 10 min with
the addition of 5 µl of 0.3 M NaOH plus 0.5 ml of 10 mM EDTA.
The pooled cDNA was purified from unincorporated nucleotides
by gel filtration in Centri-spin columns (Princeton Separations,
Adelphia, NJ). Samples were lyophilized and dissolved in 6 µl of
hybridization buffer (5x SSC plus 0.2% SDS). Hybridizations,
washes, scanning, quantitation procedures, and pseudocolor rep-
resentations of fluorescent images have been described (3). Scans
for the two fluorescent probes were normalized either to the
fluorescence intensity of Arabidopsis mRNAs spiked into the
labeling reactions (see Figs. 2-4) or to the signal intensity of
β-actin and glyceraldehyde-3-phosphate dehydrogenase
(GAPDH; see Fig. 5).
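
Normalizing two scans to a set of control elements can be sketched as follows. This is a minimal illustration under the assumption that one channel is rescaled by a single factor so that the control elements have equal mean intensity in both channels; the function names, element names, and intensities are invented for the example and are not the procedure or data of reference (3).

```python
from statistics import mean


def normalization_factor(channel1, channel2, control_names):
    """Factor that rescales channel2 so control means match channel1."""
    return (mean(channel1[name] for name in control_names)
            / mean(channel2[name] for name in control_names))


def normalized_ratios(channel1, channel2, control_names):
    """Channel2/channel1 ratio for every element after normalization."""
    factor = normalization_factor(channel1, channel2, control_names)
    return {name: channel2[name] * factor / channel1[name] for name in channel1}


# Hypothetical intensities: two control elements and one gene of interest.
scan1 = {"control_1": 200.0, "control_2": 400.0, "gene_x": 300.0}
scan2 = {"control_1": 100.0, "control_2": 200.0, "gene_x": 450.0}
controls = ["control_1", "control_2"]

factor = normalization_factor(scan1, scan2, controls)
ratios = normalized_ratios(scan1, scan2, controls)
```

Here the second scan is half as bright overall, so the factor is 2; after rescaling, the controls have a ratio of 1 and the hypothetical gene shows a threefold difference.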

RESULTS 

Ninety-Six-Gene Microarray Design. The actions of cytokines,
growth factors, chemokines, transcription factors, MMPs, pros-
taglandins, and leukotrienes are well recognized in inflammatory
disease, particularly RA (11-14). Fig. 1 displays the selected genes
for this study and also includes control cDNAs of housekeeping
genes such as β-actin and GAPDH and genes from Arabidopsis
for signal normalization and quantitation (row A, columns 1-12).

Defining Microarray Assay Conditions. Different lengths and 
concentrations of target DNA were tested by arraying PCR- 



[Fig. 1 array layout: grid of target element names with the
corresponding gene names; the grid is not recoverable from the scan.
Legend categories: A. thaliana controls; human controls; cytokines
and related genes; transcription factors and related genes; MMPs and
related genes; chemokines; growth factors and related genes; other
genes.]



Fig. 1. Ninety-six-element microarray design. The target element name and the corresponding gene are shown in the layout. Some genes have
more than one target element to guarantee specificity of signal. For TNF the targets represent decreasing lengths of 1, 0.8, 0.6, 0.4, and 0.2 kb from
left to right.



2152 Biochemistry: Heller et al.

Proc. Natl. Acad. Sci. USA 94 (1997)



amplified products ranging from 0.2 to 1.2 kb at concentrations
of 1 µg/µl or less. No significant difference in the signal levels was
observed within this range of target size, and only with the 0.2-kb
length was the signal reduced upon an 8-fold dilution of the 1 µg/µl
sample (data not shown). In this study the average length of the
targets was 1 kb, with a few exceptions in the range of ~300 bp,
arrayed at a concentration of 1 µg/µl. Normally one PCR
provided sufficient material to fabricate up to 1000 microarray targets.

In considering positional effects in the development of the
targets for the microarrays, selection was biased toward the 3'
proximal regions, because the signal was reduced if the target
fragment was biased toward the 5' end (data not shown). This
result was anticipated since the hybridizing probe is prepared by
reverse transcription with oligo(dT)-primed mRNA and is richer
in 3' proximal sequences. Cross-hybridizations of probes to
targets of a gene family were analyzed with the matrix
metalloproteinases as the example because they can show regions of
sequence identities of greater than 70%. With collagenase-1
(Col-1) and collagenase-2 (Col-2) genes as targets with up to 70%
sequence identity, and stromelysin-1 (Strom-1) and stromelysin-2
(Strom-2) genes with different degrees of identity, our results
showed that a short region of overlap, even with 70-90% sequence
identity, produced a low level of cross-hybridization.
However, shorter regions of identity spread over the length of the
target resulted in cross-hybridization (data not shown). For
closely related genes, targets were designed by avoiding long
stretches of homology. For members of a gene family two or more
target regions were included to discriminate between specificity
of signal versus cross-hybridization.
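
As an illustration of the design rule above, the following sketch screens
a candidate target against related sequences by measuring the longest
contiguous run of identical bases they share. This is only a simple proxy
chosen for the example: the text does not state how homology was assessed,
and the function names and the `max_stretch` cutoff are hypothetical.

```python
def longest_shared_stretch(seq_a, seq_b):
    """Length of the longest identical contiguous run shared by two sequences.

    Uses the standard longest-common-substring dynamic program with one
    rolling row, so memory stays proportional to len(seq_b).
    """
    best = 0
    prev = [0] * (len(seq_b) + 1)
    for a in seq_a:
        curr = [0] * (len(seq_b) + 1)
        for j, b in enumerate(seq_b, start=1):
            if a == b:
                curr[j] = prev[j - 1] + 1
                best = max(best, curr[j])
        prev = curr
    return best


def acceptable_target(candidate, relatives, max_stretch):
    """True if no related sequence shares a run longer than max_stretch."""
    return all(
        longest_shared_stretch(candidate, rel) <= max_stretch
        for rel in relatives
    )
```

An exact-match run ignores mismatched but still similar regions, so a real
screen would more likely rely on an alignment tool; the sketch only makes
the "avoid long stretches" criterion concrete.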

Monitoring Differential Expression in Cultured Cell Lines. In
RA tissue, the monocyte/macrophage population plays a prominent
role in phagocytic and immunomodulatory activities. Typically
these cells, when triggered by an immunogen, produce the
proinflammatory cytokines TNF and IL-1. We have used the
monocyte cell line MM6 and monitored changes in gene expression
upon activation with LPS endotoxin, a component of Gram-negative
bacterial membranes, and PMA, which augments the
action of LPS on TNF production (15). RNA was isolated at
different times after induction and used for cDNA probe
preparation. From this time course it was clear that TNF expression
was induced within 15 min of treatment, reached maximum levels
in 1 hr, remained high until 4 hr, and subsequently declined (Fig.
2A). Many other cytokine genes were also transiently activated,
such as IL-1α and -β, IL-6, and granulocyte colony-stimulating
factor (GCSF). Prominent chemokines activated were IL-8,
macrophage inflammatory protein (MIP)-1β, more so than MIP-1α,
and Groα or melanoma growth stimulatory factor. Migration
inhibitory factor (MIF), expressed in the uninduced state, declined
in LPS-activated cells. Of the immediate early genes, the noticeable
ones were c-fos, fra-1, c-jun, NF-κBp50, and IκB, with c-rel
expression observed even in the uninduced state (Fig. 2B). These
expression patterns are consistent with reported patterns of
activation of certain LPS- and PMA-induced genes (12).
Demonstrated here is the unique ability of this system to allow parallel
visualization of a large number of gene activities over a period of
time.

Fig. 2. Time course for LPS/PMA-induced MM6 cells. Array elements are described in Fig. 1. (A) Pseudocolor representations of fluorescent
scans correspond to gene expression levels at each time point. The array is made up of 8 Arabidopsis control targets and 86 human cDNA targets,
the majority of which are genes with known or suspected involvement in inflammation. The color bars provide a comparative calibration scale
between arrays and are derived from the Arabidopsis mRNA samples that are introduced in equal amounts during probe preparation. Fluorescent
probes were made by labeling mRNA from untreated MM6 cells or LPS- and PMA-treated cells. mRNA was isolated at indicated times after
induction. (B I-III) The two-color samples were cohybridized, and microarray scans provided the data for the levels of select transcripts at different
time points relative to abundance at time zero. The analysis was performed using normalized data collected from 8-bit images.

The SW1353 cell line is derived from malignant tumors of the
cartilage and behaves much like chondrocytes upon stimulation
with TNF and IL-1 in the expression of MMPs (9). In
addition to confirming our earlier observations with Northern
blots on Strom-1, Col-1, and Col-3 expression (9), gelatinase
(Gel) A, putative metalloproteinase (PUMP)-1, membrane-type
matrix metalloproteinase, tissue inhibitors of matrix
metalloproteinases or tissue inhibitor of metalloproteinase 1
(TIMP-1), -2, and -3 were also expressed by these cells together
with the human matrix metallo-elastase (HME; Fig. 3A). HME
induction was estimated to be ~50-fold and was greater than
any of the other MMPs examined (Fig. 3B). This result was
unexpected because HME is reportedly expressed only by
alveolar macrophage and placental cells (16). Expression of
the cytokines and chemokines IL-6, IL-8, MIF, and MIP-1β
was also noted. A variety of other genes, including certain
transcription factors, were also up-regulated (Fig. 3), but the
overall time-dependent expression of genes in the SW1353
cells was qualitatively distinct from the MM6 cells.

Quantitation of differential gene expression (Figs. 2B and
3B) was achieved with the simultaneous hybridization of
Cy3-labeled cDNA from untreated cells and Cy5-labeled
cDNA from treated samples. The estimated increases in
expression from these microarrays for a select number of genes
including IL-1β, IL-8, MIP-1β, TNF, HME, Col-1, Col-3,
Strom-1, and Strom-2 were compared with data collected from
dot blot analysis. Results (not shown) were in close agreement
and confirmed our earlier observations on the use of the
microarray method for the quantitation of gene expression (3).
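
The two-color quantitation described above amounts to a ratio of the
treated signal to the untreated signal for each array element. A minimal
sketch follows, assuming both intensities are already normalized; the
`floor` guard against division by near-zero background is an assumption
added for the example and is not taken from the text.

```python
def fold_change(treated, untreated, floor=1.0):
    """Ratio of treated (Cy5) to untreated (Cy3) normalized intensity.

    Both values are clamped to `floor` so that elements with no
    detectable signal in one channel do not produce a division by zero.
    The clamp and its default value are hypothetical.
    """
    return max(treated, floor) / max(untreated, floor)


def fold_changes(cy5, cy3, floor=1.0):
    """Per-element fold change for elements present in both channels."""
    return {
        name: fold_change(cy5[name], cy3[name], floor)
        for name in cy5
        if name in cy3
    }
```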

Expression Profiles in Primary Chondrocytes and Synoviocytes
of Human RA Tissue. Given the sensitivity and the
specificity of this method, expression profiles of primary
synoviocytes and chondrocytes from diseased tissue were
examined. Without prior exposure to inducing agents, low level
expression of c-jun, GCSF, IL-3, TNF-β, MIF, and RANTES
(regulated upon activation, normal T cell expressed and
secreted) was seen as well as expression of MMPs, GelA,
Strom-1, Col-1, and the three TIMPs. In this case, Col-2
hybridization was considered to be nonspecific because the
second Col-2 target taken from the 3' end of the gene gave no
signal. Treatment with PMA and IL-1, more so than with TNF and
IL-1, produced a dramatic up-regulation in expression of
several genes in both of these primary cell types. These genes
are as follows: the cytokine IL-6; the chemokines IL-8 and
Gro-1α; the MMPs Strom-1, Col-1, Col-3, and HME; and
the adhesion molecule, vascular cell adhesion molecule 1
(VCAM-1). The surprise again is HME expression in these
primary cells, for reasons discussed above. From these results
the expression profiles of synoviocytes and the chondrocytes
appear very similar; the differences are more quantitative than
qualitative. Treatment of the primary chondrocytes with the
anabolic growth factor TGF-β had an interesting profile in that
it produced a remarkable down-regulation of genes expressed
in both the untreated and induced state (Fig. 4).

Fig. 3. Time course for IL-1β- and TNF-induced SW1353 cells
using the inflammation array (Fig. 1). (A) Pseudocolor representation
of fluorescent scans corresponds to gene expression levels at each time
point. (B I-II) Relative levels of selected genes at different time points
compared with time zero.

Fig. 4. Expression profiles for early passage primary synoviocytes and
chondrocytes isolated from RA tissue, cultured in the presence of 10%
fetal calf serum and activated with PMA and IL-1β, or TNF and IL-1β,
or TGF-β for 18 hr. The color bars provide a comparative calibration scale
between arrays and are derived from the Arabidopsis mRNA samples that
are introduced in equal amounts during probe preparation.

Given the demonstrated effectiveness of this technology, a
comparative analysis of two different inflammatory disease
states was conducted with probes made from RA tissue and
IBD samples. RA samples were from late stage rheumatoid
synovial tissue, and IBD specimens were obtained from
inflamed lower intestinal mucosa of patients with Crohn disease.
With both the 96-element known gene microarray and the
1000-gene microarray of cDNAs selected from a peripheral
human blood cell library (3), distinct differences in gene
expression patterns were evident. On the 96-gene array, RA
tissue samples from different affected individuals gave similar
profiles (data not shown) as did different samples from the
same individual (Fig. 5). These patterns were notably similar
to those observed with primary synoviocytes and chondrocytes
(Fig. 4). Included in the list of prominently up-regulated genes
are IL-6, the MMPs Strom-1, Col-1, GelA, HME, and in
[text illegible in source] TIMP-1 and
TIMP-3, and the adhesion molecule VCAM. Discernible levels
of macrophage chemotactic protein 1 (MCP-1), MIF, and
RANTES were also noted. IBD samples were, in comparison,
rather subdued, although IL-1 converting enzyme (ICE),
TIMP-1, and MIF were notable in all three IBD
samples examined here. In IBD-A, one of three individual
samples, ICE, VCAM, Groα, and MMP expression was more
pronounced than in the others.

Fig. 5. Expression profiles of RA tissue (A) and IBD tissue (B).
mRNA from RA tissue samples obtained from the same individual was
isolated directly after excision (RA 21.5A) or maintained in culture
without serum for 2 hr (RA 21.5B) or for 6 hr (RA 21.5C). Profiles
from tissue samples of two other individuals (data not shown) were
remarkably similar to the ones shown here. IBD-A and IBD-CI are
from mRNA samples prepared directly after surgery from two
separate individuals. For the IBD-CII probe, the tissue sample was cultured
in medium without serum for 2 hr before mRNA preparation.

We also made use of a peripheral blood cDNA library (3)
to identify genes expressed by lymphocytes infiltrating the
inflamed tissues from the circulating blood. With the
1046-element array of randomly selected cDNAs from this library,
probes made from RA and IBD samples showed hybridizations
to a large number of genes. Of these, many were common
between the two disease tissues while others were differentially
expressed (data not shown). A complete survey of these genes
was beyond the scope of this study, but for this report we
picked three genes that were up-regulated in the RA tissue
relative to IBD. These cDNAs were sequenced and identified
by comparison to the GenBank database. They are TIMP-1,
ferritin light chain, and manganese superoxide dismutase
(MnSOD). Differential expression of MnSOD was only
observed in samples of RA tissue explants maintained in growth
medium without serum for anywhere between 2 and 16 hr. These
results also indicate that the expression profile of genes can be
altered when explants are transferred to culture conditions.

DISCUSSION 

The speed, ease, and feasibility of simultaneously monitoring
differential expression of hundreds of genes with the cDNA
microarray-based system (1-3) is demonstrated here in the
analysis of a complex disease such as RA. Many different cell
types in the RA tissue (macrophages, lymphocytes, plasma cells,
neutrophils, synoviocytes, chondrocytes, etc.) are known to
contribute to the development of the disease with the expression of
gene products known to be proinflammatory. They include the
cytokines, chemokines, growth factors, MMPs, eicosanoids, and
others (7, 11-14), and the design of the 96-element known gene
microarray was based on this knowledge and depended on the
availability of the genes. The technology was validated by
confirming earlier observations on the expression of TNF by the
monocyte cell line MM6, and of Col-1 and Col-3 expression in the
chondrosarcoma cells and articular chondrocytes (9, 12). In our
time-dependent survey the chronological order of gene activities
in and between gene families was compared and the results have
provided unprecedented profiles of the cytokines (TNF, IL-1,
IL-6, GCSF, and MIF), chemokines (MIP-1α, MIP-1β, IL-8, and
Gro-1), certain transcription factors, and the matrix
metalloproteinases (GelA, Strom-1, Col-1, Col-3, HME) in the
macrophage cell line MM6 and in the SW1353 chondrosarcoma cells.
Earlier reports of cytokine production in the diseased state had
established a model in which TNF is a major participant in RA.
Its expression reportedly preceded that of the other cytokines and
effector molecules (4). Our results strongly support this model,
as demonstrated in the time course of the MM6 cells where TNF
induction preceded that of IL-1α and IL-1β, followed by IL-6 and
GCSF. These expression profiles demonstrate the utility of the
microarrays in determining the hierarchy of signaling events.
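
One way to make the idea of ordering induction events concrete is to
record, for each gene, the earliest time point at which its fold change
over time zero crosses a cutoff, and then sort genes by that time. The
sketch below is illustrative only: the 2-fold cutoff, the function names,
and the data layout are assumptions, and no values from this study are
used.

```python
def onset_time(series, threshold=2.0):
    """Earliest time point whose fold change meets the threshold.

    series maps time (hours) to fold change relative to time zero.
    Returns None if the gene never reaches the threshold.
    """
    for time_point, value in sorted(series.items()):
        if value >= threshold:
            return time_point
    return None


def induction_order(profiles, threshold=2.0):
    """Genes that reach the threshold, sorted by onset time."""
    onsets = {gene: onset_time(s, threshold) for gene, s in profiles.items()}
    induced = [(t, gene) for gene, t in onsets.items() if t is not None]
    return [gene for _, gene in sorted(induced)]
```

Genes with the same onset time are ordered alphabetically, which is an
arbitrary tie-break; finer sampling would be needed to separate them.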

In the SW1353 chondrosarcoma cells, all the known MMPs and
TIMPs were examined simultaneously. HME expression was
discovered, which previously had been observed in only the
stromal cells and alveolar macrophages of smoker's lungs and in
placental tissue. Its presence in cells of the RA tissue is
meaningful because its activity can cause significant destruction of
elastin and basement membrane components (16, 17). Expression
profiles of synovial fibroblasts and articular chondrocytes were
remarkably similar and not too different from the SW1353 cells,
indicating that the fibroblast and the chondrocyte can play equally
aggressive roles in joint erosion. Prominent genes expressed were
the MMPs, but chemokines and cytokines were also produced by
these cells. The effect of the anabolic growth factor TGF-β was
profoundly evident in demonstrating the down-regulation of these
catabolic activities.

RA tissue samples undeniably reflected profiles similar to
the cell types examined. Active genes observed were IL-3, IL-6,
ICE, the MMPs including HME and TIMP-1, chemokines IL-8,
Groα, MIP, MIF, and RANTES, and the adhesion molecule
VCAM. Of the growth factors, fibroblast growth factor β was
observed most frequently. In comparison, the expression
patterns in the other inflammatory state (i.e., IBD) were not
as marked as in the RA samples, at least as obtained from the
tissue samples selected for this study.

As an alternative approach, the 1046 cDNA microarray of
randomly selected genes from a lymphocyte library was used to
identify genes expressed in RA tissue (3). Many genes on this
array hybridized with probes made from both RA and IBD tissue
samples. The results are not surprising because inflammatory
tissue is abundantly supplied with cell types infiltrating from the
circulating blood, made apparent also by the high levels of
chemokine expression in RA tissue. Because of the magnitude of
the effort required to identify all the hybridized genes, we have for
this report chosen to describe only three differentially expressed
genes, mainly to verify this method of analysis.

Of the large number of genes observed here, a fair number
were already known as active participants in inflammatory
disease. These are TNF, IL-1, IL-6, IL-8, GCSF, RANTES, and
VCAM. The novel participants not previously reported are
HME, IL-3, ICE, and Groα. With our discovery of HME
expression in RA, this gene becomes a target for drug
intervention. ICE is a cysteine protease well known for its IL-1β
processing activity (18), and recognized for its role in apoptotic cell death
(19). Its expression in RA tissue is intriguing. IL-3 is recognized
for its growth-promoting activity in hematopoietic cell lineages and is
a product of activated T cells (20); its expression in synoviocytes
and chondrocytes of RA tissue is a novel observation.

Like IL-8, Groα is a C-X-C subgroup chemokine and is a
potent neutrophil and basophil chemoattractant. It
down-regulates the expression of types I and III interstitial collagens
(21, 22) and is seen here produced by the MM6 cells, in primary
synoviocytes, and in RA tissue. With the presence of the C-C
chemokines RANTES, MCP, and MIP-1β (23), migration and
infiltration of monocytes, particularly T cells, into the tissue are
also enhanced (5), aiding the trafficking and recruitment of
leukocytes into the RA tissue. Their activation, phagocytosis,
degranulation, and respiratory bursts could be responsible for
the induction of MnSOD in RA. MnSOD is also induced by
TNF and IL-1 and serves a protective function against
oxidative damage. The induction of the ferritin light chain encoding
gene in this tissue may be for reasons similar to those for
MnSOD. Ferritin is the major intracellular iron storage protein
and it is responsive to intracellular oxidative stress and reactive
oxygen intermediates generated during inflammation (24, 25).
The active expression of TIMP-1 in RA tissue, as detected by
the 1000-element array, is no surprise because our results have
repeatedly shown TIMP-1 to be expressed in the constitutive
and induced states of RA cells and tissues.

The suitability of the cDNA microarray technology for
profiling diseases and for identifying disease-related genes is
well documented here. This technology could provide new
targets for drug development and disease therapies, and in
doing so allow for improved treatment of chronic diseases that
are challenging because of their complexity.